
Writing clean, isolated, and efficient unit tests is often a challenge for developers. A good test suite should cover all the possible business scenarios, and covering multiple scenarios requires more test data.
For instance, imagine you are writing a test for some Service component. If the only responsibility of this service is to collect some data from the DAO layer and pass it on to the Business Delegate, life is easy. You can create a mock for the DAO using a framework like EasyMock and you are done. But that's often not the case. For testing services with complex business logic, it's not sufficient to return dummy data. In this case, we create the expected test data and mock the DAO to return it. If this input seed is a simple object, it's a few additional lines of code and we are done. But what if the test is dealing with a complex data model? The usual practice I have observed amongst developers is to create private methods deep down in the test class to generate test data: generateXXX().
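To make the pattern concrete, here is a minimal sketch of a service test that feeds generated seed data through a stubbed DAO. Everything in it is hypothetical and for illustration only: `Order`, `OrderDao`, `OrderService` and `generateOrders` are made-up names, and the DAO is stubbed with a plain lambda rather than EasyMock so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object used only for this illustration.
final class Order {
    final String id;
    final double amount;

    Order(String id, double amount) {
        this.id = id;
        this.amount = amount;
    }
}

// Hypothetical DAO contract that the service depends on.
interface OrderDao {
    List<Order> findByCustomer(String customerId);
}

// Hypothetical service with a small piece of business logic to test.
class OrderService {
    private final OrderDao dao;

    OrderService(OrderDao dao) {
        this.dao = dao;
    }

    double totalFor(String customerId) {
        double total = 0;
        for (Order order : dao.findByCustomer(customerId)) {
            total += order.amount;
        }
        return total;
    }
}

class OrderServiceTestSketch {
    // The typical generateXXX() helper: builds seed data for the test.
    static List<Order> generateOrders(int count, double amountEach) {
        List<Order> orders = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            orders.add(new Order("order-" + i, amountEach));
        }
        return orders;
    }

    // Stubs the DAO to return the generated data and runs the service.
    static double run() {
        OrderDao stubDao = customerId -> generateOrders(3, 10.0);
        return new OrderService(stubDao).totalFor("customer-1");
    }
}
```

With a flat data model like this the helper stays small. The trouble starts when the seed object is a deep graph and the generate methods begin to outgrow the tests themselves.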
Have you ever spent days rewriting a whole application, burning the midnight oil? Well, I don't think you are the only one. In the early days of my career I made similar mistakes too, or was a victim of mistakes made by fellow developers. Having taken this roller coaster ride and spent sleepless nights fixing code, I learned a new mantra: Tune Early, Tune Often. In most enterprise development cycles, performance testing and tuning is done pretty late. Typically we start worrying about it during system testing, and by the time the application goes into the UAT phase (if it gets there at all), performance has become a critical requirement. So if we are building applications where performance is as important as any functional requirement, why not invest some time in addressing it early in the development phase?
Complex Event Processing (CEP) is a buzzword that has been doing the rounds in the industry for the last couple of years. The concept was introduced by David Luckham of Stanford University, who has done over a decade of research in this field. Let's try to understand some terms that are used frequently in this arena.
An Event is a piece of data representing something that happened in the real world or in a software system. Events often record a change in state, e.g. a stock tick or a password change. A linearly ordered sequence of events forms an Event Stream, while a partially ordered set of events forms an Event Cloud. So an event stream is also an event cloud, but the reverse need not be true.
For example, the set of all stock trades for GOOG within a 5-minute time window is an Event Stream, while all stocks sold in a business day form an Event Cloud. The event stream above could be a part of this event cloud.
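The GOOG example can be sketched in a few lines of plain Java. This is only an illustration under simplifying assumptions: `TradeEvent` and `streamFor` are hypothetical names, the cloud is modelled as an unordered collection, and the linear order of the stream is taken to be the timestamp order. It is not how a real CEP engine represents events.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical event type: one stock trade.
final class TradeEvent {
    final String symbol;
    final long timestampMillis;
    final double price;

    TradeEvent(String symbol, long timestampMillis, double price) {
        this.symbol = symbol;
        this.timestampMillis = timestampMillis;
        this.price = price;
    }
}

class EventWindowSketch {
    // Carves an event stream out of an event cloud: keeps the trades for
    // one symbol inside [startMillis, startMillis + windowMillis) and
    // orders them linearly by timestamp.
    static List<TradeEvent> streamFor(Collection<TradeEvent> cloud,
                                      String symbol,
                                      long startMillis,
                                      long windowMillis) {
        long endMillis = startMillis + windowMillis;
        return cloud.stream()
                .filter(e -> e.symbol.equals(symbol))
                .filter(e -> e.timestampMillis >= startMillis
                        && e.timestampMillis < endMillis)
                .sorted(Comparator.comparingLong((TradeEvent e) -> e.timestampMillis))
                .collect(Collectors.toList());
    }
}
```

Calling `streamFor(allTradesToday, "GOOG", start, 5 * 60 * 1000L)` would give the 5-minute GOOG stream, and every event in the result also belongs to the day's cloud.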
The SpringSource community is coming up with Spring Batch 2.0 in Q2 2009. Spring Batch is the first Java-based framework for batch processing. I think Accenture's decades of experience in enterprise batch processing really helped in defining the use cases.
Most batch applications need to process high-volume, business-critical transactional data. A set of non-functional requirements (NFRs) is more or less mandatory in such applications: performance, scalability, restartability and repeatability. I have worked with a couple of investment banks, and in my experience such batch applications are built on either a messaging model or a multi-threading model. Architects, developers and testers spend a lot of time and effort building this robust infrastructure for batch processing, and we cannot overlook the cost involved either. Whenever you move to a new project, you end up creating your own batch processing framework all over again. Sigh!!
Some nice features introduced in Spring Batch 2.0 are conditional step execution, finer metadata access control and chunk-based processing. To perform chunk-based processing, we need to configure the commit-interval in a step. The transaction is committed once the number of items specified in commit-interval has been processed.
<step id="step1" job-repository="jobRepository" transaction-manager="transactionManager">
    <tasklet reader="itemReader" writer="itemWriter" commit-interval="10"/>
</step>
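To show what the commit-interval means in practice, here is a rough sketch of a chunk loop in plain Java. It is not Spring Batch code and does not use its API: `ChunkLoopSketch` and `process` are hypothetical names, the reader is just an `Iterator`, and the "commit" is simulated by handing a full chunk to the writer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

class ChunkLoopSketch {
    // Reads items one at a time, buffers them, and hands a chunk to the
    // writer every commitInterval items. Each writer call stands in for
    // one committed transaction. Returns the number of such commits.
    static <T> int process(Iterator<T> reader,
                           Consumer<List<T>> writer,
                           int commitInterval) {
        if (commitInterval <= 0) {
            throw new IllegalArgumentException("commitInterval must be positive");
        }
        int commits = 0;
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());
            if (chunk.size() == commitInterval) {
                writer.accept(new ArrayList<>(chunk)); // write + commit
                chunk.clear();
                commits++;
            }
        }
        // Flush the last, possibly smaller, chunk.
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk));
            commits++;
        }
        return commits;
    }
}
```

With 25 items and a commit interval of 10, this loop writes chunks of 10, 10 and 5 items. A larger interval means fewer commits, but more work to redo if a chunk fails.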
Given the features it provides and the use cases it handles, Spring Batch could prove to be the de facto framework for enterprise batch applications.
In one of my projects I did some work on process improvements. The main target was to set up independent developer workstations. We stubbed out all the interfaces. Earlier we had tried to share one development database among a group of 4-5 developers. But believe me, I've seen developers throwing coffee cups when someone touches their data. Imagine spending a couple of precious hours of your day analysing the data and getting it into the right state to run your test cases, and the moment before you hit "Run", someone drops in and messes up all your data. How insane!!
Well, the answer is simple: give me my own database and save yourself from the typical excuses. I evaluated a couple of open source databases like MySQL, Postgres and EnterpriseDB, and finally settled on OracleXE. We use Oracle in production, so it's easy to set up a development environment on OracleXE with almost no extra tweaks to the scripts. I used DBUnit for testing database modules. It works quite well, but I think my strategy for setting up test data was not optimal, and the tests now take a bit longer to run. I used Dumbster as a fake SMTP server and XStream for creating enriched objects for my tests.
One more useful tool I would like to share here is DBMonster. It really helped me generate random test data in bulk. Now I don't have to ask the DBAs to give me test data. It's not quite that simple, though, as you need to understand the states of the dataset to create the seed XML files. But I think it's still better than having human dependencies.
Happy Development!!