daily post

Test data for Big Data projects

February 6th, 2020 | daily post

Got a great question in my talk’s Q&A at the #AutomationGuild 2020:

How to create test data for #BigData projects?

Generally there are three types of ‘test data management’ you want to focus on:

1. Mocks / stubs
2. Generate synthetic test data
3. Masked production data

For big data, the most important one is masked production data.

You will need to create synthetic data too, but on its own it will not be enough to see whether the model is working properly.

So make an effort to get masked production data to have greater confidence in your data pipeline and data models.
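For the synthetic side, here is a minimal sketch of what generating test data can look like. The record fields (`customer_id`, `country`, `amount`) and the `make_synthetic_orders` helper are made up for illustration, and a real project would shape them around its own schema.

```python
import random

COUNTRIES = ["UK", "DE", "PK", "US"]  # hypothetical reference values


def make_synthetic_orders(count, seed=42):
    """Generate `count` fake order records; the same seed gives the same data."""
    rng = random.Random(seed)  # private generator, so runs are repeatable
    return [
        {
            "customer_id": f"C{rng.randint(1, 500):04d}",
            "country": rng.choice(COUNTRIES),
            "amount": round(rng.uniform(1.0, 1000.0), 2),
        }
        for _ in range(count)
    ]


orders = make_synthetic_orders(1000)
```

Data like this is clean and evenly spread, which is handy for checking the plumbing of a pipeline, but it does not carry the quirks of real data. That gap is why masked production data matters for the model.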

#QsDaily #BigData #TestData #Testing

Disliking long written test cases

February 1st, 2020 | daily post

Over the years I’ve started to dislike writing formal and long test cases.

We need test cases, but written in a better way.

Traditionally we write them detailing each step along the way.

The idea was that even someone who is not a domain expert could use them, and that we would not miss a step and would know exactly what was tested.

These monster documents become shackles: time consuming to write, a nightmare to update

Instead I prefer writing the ‘test scenario’ in brief with the most important check / validation over a full blown detailed list of steps

Also, importantly, using mind maps instead of test case management tools / documents

One does sacrifice the details this way, but it makes things much quicker.

It also provides a great holistic view of the many test types / scenarios.

#QsDaily #Testing #TestCaseWriting #LeanThinking

Automation Engineers to follow 2020

January 30th, 2020 | daily post

I’ve always believed in living a life bigger than yourself.

Thank you Joe Colantonio ⚙️ for adding me to the top 28 test automation engineers to follow in 2020.

My vision since 2013 has been “Redefining Software Quality”

I believe testers and testing practices need to evolve considerably to deliver value to businesses

Instead of trying to shoehorn ‘our way’ of how we do things into what people do,

We should develop skills and practices that support the industry’s needs of today.

IMHO there are three values which will help us get there:

– Technological Excellence

– Testing Acumen

– Business Value

#RedefiningSoftwareQuality #RSQ #Automation #Testing #TestersGoingTechnical

Three types of test data to manage

January 28th, 2020 | daily post

Test data management for automation is not just adding some fields in Cucumber…

Test data management can include:

– Developing & using mocks for running low level tests in the pipeline

– Data creation & cleaning on the fly (more here: http://bit.ly/QS_tdc)

– Using masked production data & refreshing it on a regular cadence (for exploratory / behavioral tests)

For all three, the data generation, storage and usage techniques are going to be very different.

And if you are working in a large enterprise, you most probably will need ALL THREE
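To make the first two types a bit more concrete, here is a rough sketch. Everything named in it (`rate_service`, `convert`, `temp_customer`, the `customers` table) is hypothetical, and an in-memory SQLite database stands in for a real test database.

```python
import sqlite3
from contextlib import contextmanager
from unittest.mock import Mock

# 1. Mock: stand in for a downstream service in a low level test.
rate_service = Mock()
rate_service.get_rate.return_value = 1.25


def convert(amount, service):
    """Code under test: converts an amount using whatever service it is given."""
    return amount * service.get_rate("GBP", "USD")


converted = convert(100, rate_service)


# 2. Create data on the fly and clean it up afterwards.
@contextmanager
def temp_customer(conn, name):
    """Insert a customer for the duration of a test, then delete it again."""
    cur = conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    customer_id = cur.lastrowid
    conn.commit()
    try:
        yield customer_id
    finally:
        # Runs even if the test body raises, so no data is left behind.
        conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
        conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory DB standing in for a test DB
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

with temp_customer(conn, "Test User") as customer_id:
    rows_during = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

rows_after = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The mock lives only in memory and the on-the-fly record lives only for one test. Masked production data is a different beast: it is large, long-lived and shared, so it needs its own storage and refresh process.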


No one said automation is easy…

#QsDaily #Automation #TestData

Data quality across the pipeline

January 26th, 2020 | daily post

Before any analytics can run on data, sometimes a number of ETL (Extract, Transform, Load) processes happen.

Data might pass through a series of data engineering / ETL processes to make it fit for purpose and categorized as needed

Quality across this process is all about ‘Data Quality’.

I’ll explain the concept further and talk at length about how to check data quality in the upcoming #AutomationGuild


#QsDaily #Automation #BigData #DataPipelines

Develop your API Contracts

January 22nd, 2020 | daily post

I know implementing the automation pyramid is hard,

And to a large extent, it is not a problem with just testing practices either.

I’ve seen teams where products don’t have contracts written up properly (API contracts / JSON schemas)

Back end services are not designed for anyone other than the developer to consume.

Such places do make it hard to implement the pyramid, i.e. 70 – 80% of the tests at the back end.

The solution: the whole team works on developing those contracts, and then writes tests for those contracts.

Easier said than done, but until you have that, testing isn’t going anywhere.
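As a rough idea of what a test for a contract can look like, here is a hand-rolled sketch. The `USER_CONTRACT` and the payloads are invented for illustration. In a real project you would more likely write the contract as a JSON schema and use a proper validator; this only shows the shape of the check.

```python
# Hypothetical contract for a user response:
# field name -> expected Python type after JSON decoding.
USER_CONTRACT = {
    "id": int,
    "email": str,
    "active": bool,
}


def contract_violations(payload, contract):
    """Return a list of readable violations; an empty list means it conforms."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        # Exact type comparison, so that True/False is not accepted as an int.
        elif type(payload[field]) is not expected_type:
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems


good = {"id": 7, "email": "a@example.com", "active": True}
bad = {"id": "7", "email": "a@example.com"}

good_result = contract_violations(good, USER_CONTRACT)
bad_result = contract_violations(bad, USER_CONTRACT)
```

Once the contract is written down like this, both the developers and the testers have something concrete to test against, which is the whole point.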

#QsDaily #Automation #Testing #ApiAutomation #AutomationPyramid

Change is the only constant

January 21st, 2020 | daily post

The only constant is change, especially in tech, regardless of what role you play.

Similarly, dear testers, we need to change and adapt.

Change is uncomfortable and scary at best, but that is not a good enough reason not to change.

There is no choice but to work on ourselves “throughout our lives”! Doesn’t matter what occupation you are in.

The next time you are exposed to a new technology, language, testing technique or lean way of working, don’t hold back.

Pushing yourself never comes naturally to ANYONE

I still ‘train’ my mind to push through my fears. And when I stop training it, I start to give in to fear.

To train your mind and subconscious, one ritual can be to listen to motivational content. These days it’s Les Brown for me (linking sample video below).

#QsDaily #TechnologicalExcellence #Motivation #TestersGoingTechnical


Do NOT Automate regression 100%

January 20th, 2020 | daily post

Automation regression percentage – The metric I hate the most..

Many times the only use of this metric is to provide false assurance that we are efficient in testing.

And the ultimate goal soon becomes to automate regression 100%, which is a bad idea,

And automating just UI tests makes it even worse.

More on why not to use it and what should be done in the linked video

#RedefiningSoftwareQuality #Automation #RegressionTesting #KPIs

Data quality quick list

January 19th, 2020 | daily post

Data quality is one of the biggest problems with data science projects.

I’ll be talking about these at the #AutomationGuild. Here’s a quick list:

– Accuracy. Is the data accurate in the context where it will be used?

– Validity. Is the data fresh enough, still valid?

– Consistency. Data from different sources / time frames matches

– Completeness. No parts of data are truncated / missing

– Uniqueness. Enough data to uniquely identify records

– Timeliness. Data being collected at the right time & processed in a timely fashion (efficient enough)
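A few of these can be turned into simple automated checks. The sketch below covers completeness, uniqueness and validity (freshness) only. The records, field names and the 30 day threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical records; in practice these would come from the pipeline.
records = [
    {"id": 1, "email": "a@example.com", "updated": datetime(2020, 1, 18)},
    {"id": 2, "email": None, "updated": datetime(2020, 1, 19)},
    {"id": 2, "email": "c@example.com", "updated": datetime(2019, 6, 1)},
]


def completeness(rows, field):
    """Fraction of rows where `field` is present and not None."""
    filled = sum(1 for r in rows if r.get(field) is not None)
    return filled / len(rows)


def duplicate_keys(rows, key):
    """Key values that appear more than once (uniqueness check)."""
    seen, dupes = set(), set()
    for r in rows:
        if r[key] in seen:
            dupes.add(r[key])
        else:
            seen.add(r[key])
    return dupes


def stale_rows(rows, field, now, max_age):
    """Rows whose timestamp is older than `max_age` (freshness check)."""
    return [r for r in rows if now - r[field] > max_age]


now = datetime(2020, 1, 19)  # fixed 'current' date so the example is repeatable
email_completeness = completeness(records, "email")
dupes = duplicate_keys(records, "id")
stale = stale_rows(records, "updated", now, timedelta(days=30))
```

Accuracy and consistency are harder to check generically, because they depend on the business context and on having a second source to compare against.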

More on the conference here:


#QsDaily #BigData #DataScience #Testing

Test data and masking

January 18th, 2020 | daily post

For large enterprises with many interconnected applications, generating test data can be a challenge.

Even more so for big data projects, which is where masking comes in.

Take copies of the production DBs of the different products from around the same point in time, and mask the data for GDPR compliance.

Easier said than done for sure, but it can be very effective.

IMHO three main requirements to build this:

1. A quick guideline to identify what data will be classified as ‘sensitive’ and ensure it’s GDPR compliant.

2. A masking platform which masks a given value the same way across all applications, so the data stays consistent

3. Efficient, fit-for-purpose creation and usage of test environments
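The second requirement, masking a value the same way everywhere, is the one that is easiest to show in code. Here is a minimal sketch using a keyed hash. The key, the `mask_value` helper, the `cust_` prefix and the sample rows are all hypothetical. Whether this kind of masking is enough for GDPR is a question for your compliance people, as the sketch only demonstrates the consistency property.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would live in a secrets store, not in code.
MASKING_KEY = b"replace-with-a-real-secret"


def mask_value(value, key=MASKING_KEY):
    """Map a sensitive value to a stable token: same input, same output."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "cust_" + digest[:12]


# The same email appears in the data of two different applications.
crm_row = {"email": "jane@example.com", "plan": "gold"}
billing_row = {"email": "jane@example.com", "balance": 42}

crm_masked = {**crm_row, "email": mask_value(crm_row["email"])}
billing_masked = {**billing_row, "email": mask_value(billing_row["email"])}
```

Because both rows end up with the same masked email, joins between the CRM and billing copies still work after masking, which is what keeps interconnected applications testable.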

#RedefiningSoftwareQuality #TestData #Automation