Ali Khalid, Author at Quality Spectrum

What Autonomy does and does NOT mean

By Ali Khalid | November 4th, 2020 | daily post

Autonomy is a cornerstone of agile transformation. However, I’ve also seen it taken the wrong way.

IMHO, autonomy means:

– Self-organizing teams who don’t have to wait for someone to take decisions on their behalf

– Teams who have the autonomy to estimate work (within reasonable guidelines) and drive decisions related to completing the work

– Because they take the technical decisions, they are also responsible for the resulting solutions – you code it, you own it

What it does NOT mean:

– Creating a bubble and making sure no one from the outside has any visibility into the workings of the team (no one can come to our stand ups / retros or other ceremonies). Agile is about transparency, remember!

– Taking autonomy as a ‘do whatever and get away with it’ badge – for example, giving unrealistic estimates with no technical reasoning for why it’s going to take that long.

A big factor in all this, I guess, is having a purpose and being driven.

And make no mistake – this will have a direct impact on the quality of your product – and hence should be factored into the KPIs you measure.

What is your experience on autonomy in teams?


Testing AI based systems

By Ali Khalid | October 31st, 2020 | Uncategorized

This year’s QA&TEST conference was another great edition. While planning the event there were discussions on whether it should be held in person or online; it turns out going online was definitely the right choice. I did miss the traditional Bilbao food and the amazing experience I had there as a speaker last year, but this year’s online event was well done too.

The AI track of the event had a talk by Parveen Kumar and a panel discussion, which opened with a quick 10-minute talk by each panelist followed by discussions on testing AI systems. I had a blast with the panelists: we were all from different backgrounds but, surprisingly, had very similar thoughts and challenges when it came to testing AI products. I thoroughly enjoyed the talks and presentations, which gave me some new insights that I want to share in this article.

Testing AI starts with data

When thinking about testing AI systems, folks start debating how to get into that black box neural network to figure out what’s going on and how to test it. The challenge, of course, is that these are evolving programs which no one directly controls. In the case of machine learning algorithms, the biggest factor is the data used to train the model – and IMHO that’s where testing should start.

Before loads of data can be fed into the models, all that data needs to be gathered, cleansed and modelled. In my segment I talked about some fundamental concepts like data pipelines, data quality across these pipelines and some quality checks to look out for across different stages.

What’s a data pipeline?

All this usually starts with ‘Big’ data, which means the variety of data, the speed of processing and the size of the data all play a big part. From gathering this data until feeding it to an AI model, the data passes through lots of stages. At a high level we classify them as:

Data ingestion – Getting data from different sources
Data lake & data warehouse – Creating data / domain models and normalized tables
Analytics – Creating semantic models and running analytics, or feeding data into machine learning algorithms

As data is processed through this pipeline, its quality has to be measured to ensure the end output is clean data. One technique for doing this is data quality dimensions: 6 different attributes / characteristics (dimensions) which any data set should conform to. Measuring your data against these dimensions helps to analyze whether the data is accurate and fit for purpose.
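To make the idea of measuring dimensions concrete, here is a minimal sketch in Python. The text above does not list the six dimensions, so the three used here (completeness, uniqueness, validity) are commonly cited ones and are an assumption; the sample records, field names and the crude e-mail check are purely hypothetical.

```python
# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "b@example.com"},
    {"id": 4, "email": "not-an-email"},
]


def completeness(records, field):
    """Share of records where `field` is present and not None."""
    filled = sum(1 for r in records if r.get(field) is not None)
    return filled / len(records)


def uniqueness(records, field):
    """Share of distinct values of `field` among all records."""
    values = [r.get(field) for r in records]
    return len(set(values)) / len(values)


def validity(records, field, is_valid):
    """Share of non-missing values of `field` that pass `is_valid`."""
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if is_valid(v)) / len(values)


def looks_like_email(value):
    """Deliberately crude check, good enough for a sketch."""
    return "@" in value and "." in value.split("@")[-1]


def quality_report(records):
    """Score the records against the three assumed dimensions."""
    return {
        "completeness(email)": completeness(records, "email"),
        "uniqueness(id)": uniqueness(records, "id"),
        "validity(email)": validity(records, "email", looks_like_email),
    }
```

Calling `quality_report(RECORDS)` gives one score between 0 and 1 per check, which could then be compared against whatever thresholds a given pipeline stage cares about.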

Stages across the data pipeline curate data with different objectives, so the quality dimensions to look out for also differ. While this is a very detailed subject and I usually do a complete talk just on this, the illustration below summarizes some examples:

Interesting insights

The talks and questions during the panel discussion unearthed some interesting points which I feel might be very helpful for teams exploring how to test AI systems.

Regulatory requirements

For safety critical devices, regulatory bodies provide practices, processes and guidelines governing how safety approvals will be given. With AI products, the community is debating what a more practical and pragmatic approach to certifying AI systems would be.

Due to the evolving nature of AI products, it is possible the guidelines will be more process based rather than built around the product’s functionality itself, since that is going to be a moving target. It goes without saying this is a very complicated problem to solve and the stakes are high. Take the example of self-driving cars and their impact.

Continuous learning algorithms

Certain ML models, such as deep learning models, are ever evolving. After the product is released, the model still keeps learning and changing its algorithm. This poses a different set of challenges, and the traditional test and release cycles are not enough. Observability and production tests become a must in such situations, which means testing becomes an ongoing activity happening in production.
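One possible shape for such an ongoing production check is a rolling accuracy monitor, sketched below in Python. This is only an illustration under assumptions: the class name, the window size and the threshold are hypothetical, and it presumes the actual outcome for a prediction becomes known at some point after the prediction is made.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Tracks accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window=100, threshold=0.9):
        # deque with maxlen silently drops the oldest outcome when full
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether one production prediction turned out correct."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def is_degraded(self):
        """True when windowed accuracy has fallen below the threshold."""
        acc = self.accuracy()
        return acc is not None and acc < self.threshold
```

A monitor like this would be fed from production logs, with `is_degraded()` wired to an alert, so a model that drifts after release is noticed without waiting for the next test cycle.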

Biases in AI products

AI models build their intelligence through machine learning by consuming large amounts of training data. The kind of data we provide is going to govern the kind of assumptions the model makes. In recent years a few incidents have surfaced where an AI model was biased towards a certain group of people or other variables.

The challenge is that many times we don’t even know a bias exists. For instance, a leading tech company had an AI program to shortlist resumes. Later it was discovered that the program assumed highly skilled people are usually male candidates. This was perhaps due to its training data: since most candidates who got job offers were men, it made that assumption. Even after the problem was known it was very hard to fix, and eventually they had to stop using the program!

The most evident solution is to first figure out any biases that may exist before the training data is provided. The challenge is, of course, knowing about those biases. What can also help is providing a very wide range of data, and training and testing on very different data sets. This can highlight any incorrect assumptions and biases that might have been built / inferred from the training data set.
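As a rough sketch of how testing on a separate data set might surface such a bias, the Python below compares how often a model shortlists candidates from each group. The group labels, field names and the choice of a simple rate ratio are all assumptions made for illustration; they are not taken from the incident described above.

```python
from collections import defaultdict


def selection_rates(rows, group_key, outcome_key):
    """Share of positive outcomes per group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        positives[group] += 1 if row[outcome_key] else 0
    return {g: positives[g] / totals[g] for g in totals}


def disparity_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0
    return min(rates.values()) / highest


# Hypothetical model decisions on a held-out test set.
PREDICTIONS = [
    {"group": "A", "shortlisted": True},
    {"group": "A", "shortlisted": True},
    {"group": "A", "shortlisted": True},
    {"group": "A", "shortlisted": False},
    {"group": "B", "shortlisted": True},
    {"group": "B", "shortlisted": False},
    {"group": "B", "shortlisted": False},
    {"group": "B", "shortlisted": False},
]
```

Running `selection_rates(PREDICTIONS, "group", "shortlisted")` and then `disparity_ratio` on the result gives a single number to watch. A low ratio does not prove the model is unfair, but it is a signal worth investigating before release.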

Standardizing building and testing AI systems

While regulators and the wider community are looking for ways to establish some baseline practices, the use cases of AI are very widespread and validating the underlying algorithm is challenging, so it’s going to be hard to reach generic product-based validations.

Having said that, some techniques can help look at these solutions through a different paradigm and identify potential risks in such systems. One such technique is STPA [link], which suggests an alternative top-down approach of looking at the system holistically and focusing just on safety instead of on the system’s functionality.

Challenges ahead

The field of AI is exciting and has lots of applications. We are already seeing many products starting to use AI in some capacity. This is only going to increase because of AI’s capability to handle multi-dimensional factors and process large amounts of data, which can be hard for humans to do.

Apart from the topics discussed above, the key challenge IMHO is going to be a lack of skills. Looking after the quality aspect is going to be even more challenging: these systems need engineers who have technical insight into how the underlying technology works and also have the ‘testing acumen’ to test well. While this has always been a problem, it seems that with AI systems & Big data projects it will be an even bigger one.


2 minute intro to Data pipelines

By Ali Khalid | October 27th, 2020 | daily post

Introduction to data pipelines in 2 minutes:

For analytics and AI products, lots of data is needed – commonly dubbed Big data.

To get the data we need, in a usable format, we have to pass it through a lot of different data processing stages, a collection of which can be called a data pipeline.

Data pipelines have 3 main stages:
– Ingestion – Gather data from various sources in different formats
– Data hub & warehouse – Cleanse and model data at different stages
– Analytics – Run analytics or use specific data sets for machine learning algorithms
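The three stages above can be sketched as three small Python functions chained together. This is a toy illustration only: the inline CSV and JSON sources, the field names and the per-user total are hypothetical stand-ins for real sources, models and analytics.

```python
import csv
import io
import json

# Two hypothetical sources in different formats.
CSV_SOURCE = "user,amount\nalice,10\nbob,\n"
JSON_SOURCE = '[{"user": "alice", "amount": "5"}, {"user": "carol", "amount": "7"}]'


def ingest():
    """Ingestion: gather raw rows from sources in different formats."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(json.loads(JSON_SOURCE))
    return rows


def cleanse(rows):
    """Data hub & warehouse: drop incomplete rows and cast types."""
    cleaned = []
    for row in rows:
        if not row.get("amount"):
            continue  # skip rows with a missing amount
        cleaned.append({"user": row["user"], "amount": int(row["amount"])})
    return cleaned


def analyse(rows):
    """Analytics: total amount per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0) + row["amount"]
    return totals


def run_pipeline():
    """Chain the three stages end to end."""
    return analyse(cleanse(ingest()))
```

Keeping each stage as its own function also means each one can be tested on its own, for example checking that `cleanse` drops the row with the missing amount before anything reaches analytics.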

Join me at TestBash New Zealand Online 2020 where I’ll talk about a lot more around testing in big data projects!


Testing AI products

By Ali Khalid | October 26th, 2020 | daily post

How would you go about testing an AI product? Join me at QA&TEST Conferences where we will debate the topic on Oct 30.

I’ve been fortunate to work on that problem a few times in past years. In my experience there is no straightforward answer.

Biggest reason for me – it’s not easy to establish clear oracles (I didn’t use the word requirements..), which creates a completely new dynamic for testing these products.

The test objectives of these systems are therefore going to be a bit different, and wouldn’t be just about testing the actual model.

I’d stretch it from the very beginning – ‘capturing of data’ – to the other extreme – ‘are the model’s predictions good enough’ – not an easy answer to give, but worth investigating.


How to & how NOT to write test cases

By Ali Khalid | October 20th, 2020 | daily post

Anti-agile test case writing:
– Write extremely lengthy test cases for every scenario anyone has ever thought of, and make sure not to rank them
– Then do a ‘regression testing’ cycle and try to execute all of them
– Then try to code ALL the written tests as automated scripts

Test case writing the Agile way:
– Collaborate to identify acceptance tests in Three Amigos sessions – these are the highest priority scenarios
– Automate tests across the tech stack – and DON’T document them – they exist in your code already
– Write checklists to serve as heuristics / references to do exploratory testing


Reminders for PI planning

By Ali Khalid | October 19th, 2020 | daily post

Some reminders for me after being part of Program Increment (PI) planning for the past week..

1. Have a rough estimate of your capacity plans prepared beforehand. That includes public holidays and annual leave

2. Have your features and related stories defined and get them roughly prioritized before the planning meetings

3. PI plans (typically across 6 sprints / 3 months) are ‘plans’ – not set in stone, and they can fluctuate over the course of the PI

4. Mark your dependencies and get them prioritized – especially critical dependencies

5. Above all – PI planning is about collaboration and figuring out where your time is best spent across the next 3 months – to deliver value to end customers
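For the rough capacity estimate in the first reminder, a back-of-the-envelope calculation in Python could look like the sketch below. The function names, the person-day unit and the example numbers are assumptions for illustration; teams often adjust further for meetings and support work.

```python
def sprint_capacity(team_size, working_days, public_holidays=0, leave_days=0):
    """Person-days available in one sprint.

    public_holidays applies to everyone; leave_days is the total number
    of annual-leave days across the whole team in that sprint.
    """
    available_days = working_days - public_holidays
    return team_size * available_days - leave_days


def pi_capacity(sprints):
    """Sum of person-days over all sprints in the program increment."""
    return sum(sprint_capacity(**sprint) for sprint in sprints)


# Hypothetical 5-person team: one sprint with a holiday and some leave,
# followed by five ordinary sprints.
EXAMPLE_PI = [
    {"team_size": 5, "working_days": 10, "public_holidays": 1, "leave_days": 3},
] + [{"team_size": 5, "working_days": 10} for _ in range(5)]
```

Here `pi_capacity(EXAMPLE_PI)` adds up the six sprints, giving a single rough figure to bring into the planning meetings.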

About:

PI planning is a ceremony in the Scaled Agile Framework (SAFe) in which release trains (groups of scrum teams) plan for the next program increment


Leading and Lagging indicators

By Ali Khalid | October 18th, 2020 | daily post

While defining KPIs, have both – Leading and lagging indicators.

Often I’ve seen people not make that distinction, which creates misconceptions about what the KPIs are saying.

Leading indicators:

Predictive measurements used to influence change. For example – how good are we at creating & working with user stories.

These are not the actual change we want to see, but rather the change which will ‘lead’ to the outcomes we are hoping for.

The problem I see – folks take these as the ‘ultimate’ outcome to measure.

Same goes for automation – a common one is ‘% tests automated’. This ‘might’ be a leading indicator in some situations – but is definitely not the ultimate goal.

Lagging indicators:

Measuring the outcome we are actually looking for. For transformation projects, these could be e.g. a 50% reduction in bugs from the field or a 40% reduction in lead time.
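Lagging indicators like these are plain before-and-after arithmetic. A minimal Python sketch, with hypothetical function names and made-up baseline figures, could be:

```python
def percent_reduction(baseline, current):
    """Reduction from `baseline` to `current`, as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) * 100 / baseline


def meets_target(baseline, current, target_percent):
    """True when the measured reduction reaches the agreed target."""
    return percent_reduction(baseline, current) >= target_percent


# Hypothetical figures: field bugs per quarter and lead time in days.
FIELD_BUGS = {"baseline": 40, "current": 20}
LEAD_TIME_DAYS = {"baseline": 10, "current": 6}
```

With these numbers, `percent_reduction(**FIELD_BUGS)` is 50.0 and `percent_reduction(**LEAD_TIME_DAYS)` is 40.0, which match the example targets mentioned above.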

So next time you define KPIs – do classify them as leading and lagging indicators.

Both are helpful – but be careful not to measure a leading indicator assuming it is the ultimate objective.


Algorithm design aptitude

By Ali Khalid | October 16th, 2020 | daily post

Algorithm design aptitude – IMHO the second most important ingredient for an SDET / Engineer

My definition of algorithm design aptitude: the ability to design a complex solution using small building blocks.

And to do this, you don’t have to start with Java. The first languages I learned in high school were HTML & CSS.

I learned how to find small building blocks, connect them and create a solution.

I’ve used those fundamental lessons everywhere – at Uni in my engineering, on the job in test automation, testing, web development, big data, you name it.

And this is what I teach and urge upcoming engineers / SDETs to learn – languages will come and go, tools pop up like mushrooms all over

What will stay with you, and help you through all of that – the ability to visualize & design solutions using existing small building blocks..

Oh, and by the way – We all do have an aptitude to design things – we just need to learn to harness it.


BDD is NOT equal to cucumber

By Ali Khalid | October 15th, 2020 | daily post

As part of the automation training program I designed for Emirates IT, I conducted another online session of the BDD workshop today..

While designing the course, I deliberately kept only a very small portion on Gherkin and using cucumber,

and put more focus on why and how to collaborate as part of BDD.

Unfortunately, as soon as most people talk about BDD, the first thing they mention is cucumber – and they forget the whole conversation that’s supposed to happen before that.


Moving into DevOps? Start with service automation

By Ali Khalid | October 12th, 2020 | daily post

If your team is just starting its DevOps journey, where should you start improving testing?

Begin by investing in automating your services layer first.

Folks assume that’s just about writing API tests, but to me it is more of a mindset change as well.

The rule of thumb is: test nearest to where the production code is written.

For instance, any functionality written within a microservice should be tested at the microservice level, functionality on the mesh gateway should be tested there, and so on.

DevOps tools are not magically going to test and debug problems for you. Unless the tests are planned at the right place, they will be flaky and add overhead while debugging.

#RedefiningSoftwareQuality #DevOps #QualityTransformation #TestAutomation
