Automating the automation: automating VSTest

Throughout my career I have, on multiple occasions, found it useful to automate the execution of Visual Studio tests. The goal of this article is to show how to do so and why. I hope you can find it useful no matter your experience; experienced readers might pick up some ideas on using the capabilities of VSTest, batch files and the generated logs, setting up the basics for parts we will revisit in the future.

If you have never created or run tests in Visual Studio, take this as a starter guide on executing automation. If you have ever run Visual Studio Unit Tests, you should be familiar with the Test Explorer tool:

At times, we might want to include these tests as part of a different system rather than executing them from Visual Studio. It is possible to use some existing functionality in the command line to help you do so: VSTest.Console.exe is a command-line tool to run Visual Studio tests.

Steps 1 and 2 set up the test data for this post; feel free to skip them if you consider them too simple. Otherwise, I hope they can help you get started with unit tests in Visual Studio and general automation with it.

Disclaimer: I'm explaining this from a Windows operating system's perspective, but it could be done similarly from anywhere you can run the VSTest console tool.

Step 1: Create a unit test project

Note: needless to say, you need Visual Studio installed to follow these steps.

We are going to start by creating a test project in Visual Studio. To do so, we can click File -> New -> Project and select a C# 'Test' template, as in the image below.

Select the destination folder and insert a name for the project. Refer to the Visual Studio documentation if you have any issues.

Step 2: Create tests

The project should come with a unit test class by default. If for some strange reason that's not the case, or you want to create another class, you can right-click the project and select Add -> Unit Test (note that if you do this when you already had a test class, you might end up with a duplicate TestMethod1 in your list later on).

Now we are going to add a command inside the test that comes by default (because we are not adding test logic for now, we just add the 'Fail' assertion as shown below). We are also adding a Priority attribute above this test. Then we can copy and paste the lines from [TestMethod] (line 9 in the image below) up to the last '}' (line 14) three times to get a few tests in our Test Explorer. Change the names of the methods so each has a different name (for the example we have TestMethod1, TestMethod2, TestMethod3 and TestMethod4).

For the example, the odd tests (TestMethod1 and TestMethod3) have priority 2 and the even tests (TestMethod2 and TestMethod4) have priority 1.
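As a reference, here is a minimal sketch of how the class could end up looking (only the first two methods are shown; the remaining ones follow the same pattern):

using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace UnitTestProject1
{
    [TestClass]
    public class UnitTest1
    {
        [Priority(2)]
        [TestMethod]
        public void TestMethod1()
        {
            Assert.Fail();
        }

        [Priority(1)]
        [TestMethod]
        public void TestMethod2()
        {
            Assert.Fail();
        }

        // TestMethod3 (priority 2) and TestMethod4 (priority 1) follow the same pattern
    }
}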

To see the Test Explorer, we should go to the menu Test -> Windows -> Test Explorer.

And to see the tests we have created, we should go to Build -> Rebuild Solution.

Now you can try to run the tests from the left-hand side. They should fail, because we have not added any logic to them. If you are interested in how to do this, leave a comment below and I'll write another article with more details. You can also look up the Visual Studio documentation to learn more.

In the next step we are going to learn how to execute the test cases outside Visual Studio.

Step 3: VSTest

Now, if we want to execute these tests outside Visual Studio's environment, we can use the console that is typically installed under the following path:

C:\Program Files (x86)\insert VS version here\Common7\IDE\CommonExtensions\Microsoft\TestWindow

Where "insert VS version here" would be something like "Microsoft Visual Studio 14.0" or "Microsoft Visual Studio\2017\Enterprise"... basically, you can navigate to your Program Files (x86) folder and look for a folder that has Microsoft Visual Studio in its name, then continue with the rest of the path as above.

Inside the aforementioned folder you can find "vstest.console.exe". Alternatively, you can download the NuGet package and look for the executable in the path where it is installed.

Typically we would access this file from the command line (be it an admin cmd or Visual Studio's developer command prompt).

We can open cmd (as administrator, by right-clicking on it) and type "cd path", where path is the path to the above file with the correct Visual Studio version for your installation. It is convenient to surround this path with quotes (""), so that if it contains spaces the program recognises them as part of the path and not as a separate command.

Now, you can select which tests to run by adding some parameters to the call, but you can try it out by calling "vstest.console.exe" followed by a space and the path to the dll of the test project. For example: vstest.console.exe c:\....\UnitTestProject1\UnitTestProject1\bin\Debug\UnitTestProject1.dll

You should see something like this:

This is because you've set up your tests to fail. Later on, once your tests are finished and they are all passing, you will see something like this:

If you are new to testing, you are probably wondering why we would want to run tests in this complicated way instead of using the beautiful UI from Visual Studio. If so, keep on reading; this is where things start to get interesting.

Now, imagine we want to run just the first and third test methods. For this, we make the following call: vstest.console.exe pathtodll /Tests:TestMethod1,TestMethod3

As you can see, now only TestMethod1 and TestMethod3 were executed (use the names of your own test methods). Note that it should be failing for you; I'm just adding the passing image because it is cleaner.

Remember we set up priorities before? So how do we run only the tests with a given priority? vstest.console.exe pathtodll /TestCaseFilter:"Priority=1"

The Microsoft vstest.console documentation describes many more ways to use this tool, including parallel runs. Have you started to see why this tool could be very powerful? What else could we do with it?
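For example (a sketch based on the documented switches; check the /? output of your version), we could combine filter conditions with the | ("or") operator, or run several test containers in parallel:

vstest.console.exe pathtodll /TestCaseFilter:"Priority=1|Priority=2"
vstest.console.exe pathtodll1 pathtodll2 /Parallel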

Step 4: Creating batch files

The coolest part is that we can create a file with these calls and then use that file virtually anywhere (continuous integration, nightly builds, as part of an agent to be executed on a remote computer...). These sorts of files are called "batch files" because they run a set, or batch, of operations.

The first line of the batch file will CD into the vstest console folder. Then we can add calls to the tool to run the tests we want. Finally, we add a pause to verify that everything is working fine.

To do this, just use Windows' Notepad and type the instructions below. When saving the file, use the .bat extension instead of .doc or .txt.

cd "C:\Program Files (x86)\insert VS version here\Common7\IDE\CommonExtensions\Microsoft\TestWindow"
vstest.console.exe pathtodll /Tests:TestMethod1,TestMethod3 /Logger:Console
pause

Remember to change pathtodll to your actual project and add the right VS version. Now, if you execute this newly created file (as administrator), you should see the same results as before. Pressing any key closes the console that opens up.

If you don't want to see the results in a console (as would be the case if you integrate this file with other projects), just remove the last command (pause). The logs of the results are saved in the current folder (the one for the vstest program) and we will analyse them in the last section.

Explaining continuous integration or multi-project execution would be a bit more complicated and is out of the scope of this post (but do leave a comment or reach out on Twitter if you want me to explain it). What I can explain is how to set up your computer to run this file every night with Windows!

For this, you need to open the Task Scheduler (you can search for it in the lower-left search box on Windows). Then click on "Create Basic Task" on the right-hand side and go through the wizard. The most important steps are to specify the frequency you want the file to run with and to browse for the saved .bat file. Now this file will run automatically at the indicated frequency (if your computer is turned on at that time).

The next thing you want to do is check the logs that this file generates every time it is executed.

Step 5: Saving logs

Running automatic tasks and tests is awesome, but not really useful unless we know the results, so we can do something about them.

First of all, we should check where the logs are saved. Usually they go into a folder called "TestResults" within the folder from which you ran the file. Because we were using cmd (admin) and navigating to the vstest.console folder, it will be created there; in fact, that's the reason we need administrator permissions to run the file. There is a parameter for vstest.console to change this location, although my vstest.console was not recognising it, so I stick to the vstest.console folder for the purposes of this article.

I think trx logs are useful and should always be enabled by default. To get them, we can add a parameter to vstest: /Logger:trx. The generated file can be opened with Notepad and gives you information about the executed tests. However, we will focus on /Logger:Console, as it is simpler.

Another way of retrieving the logs is to use the capabilities of the Windows batch system. We just need to append " > pathToFile\file.txt", where pathToFile is the path all the way to a file with a .txt extension (this file does not need to exist beforehand; the command creates it). This way, a file is saved with the contents of the console output.

cd "C:\Program Files (x86)\insert VS version here\Common7\IDE\CommonExtensions\Microsoft\TestWindow"
vstest.console.exe pathtodll /Tests:TestMethod1,TestMethod3 /Logger:Console > pathToFile\file.txt
pause

You might want to save a different file every run (by adding the date and time to the name) or replace the latest one (by keeping the same name as above), depending on the frequency at which it is generated (if runs are far enough apart, we might not mind replacing it).
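As a minimal sketch of the "different file every run" option, we can build the file name from %date% and %time% (their exact format is locale-dependent, so you might need to adapt the substitutions):

cd "C:\Program Files (x86)\insert VS version here\Common7\IDE\CommonExtensions\Microsoft\TestWindow"
rem build a timestamp and replace the characters that are not valid in file names
set STAMP=%date:/=-%_%time::=-%
set STAMP=%STAMP: =0%
vstest.console.exe pathtodll /Tests:TestMethod1,TestMethod3 /Logger:Console > pathToFile\results_%STAMP%.txt
pause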

Using the parameter /Logger:TimelineLogger can give you a bit more information about execution times, but it makes the output harder to parse later on.

Step 6: Playing with the logs

Now we have a text file with the logs... but reading whole files all the time might get a bit boring. What to do with them? You get it: automate it!

Let's output just the number of test cases that have failed. We can do this with any programming language, but let's keep going with batch. Why? Because I feel people underestimate it, so here it is:

@echo off
rem count the lines of file.txt that contain the word "Failed"
set /a FAILED=0
for /f %%i in ('findstr /i /c:"Failed" file.txt') do (
    set /a FAILED=FAILED+1
)
rem the summary at the end of the log also contains "Failed", so discount it
set /a FAILED=FAILED-1
if %FAILED% gtr 0 (
    echo Failed: %FAILED%
)
pause

The first line of the file stops the commands themselves from being echoed, so only our output comes up on the screen. Then we create a variable to store the number of times we find the string "Failed", perform a loop with the search over the file called "file.txt", subtract one (because at the end there is a summary line with the word "Failed" on it, which we don't want to count) and print the result only if it is greater than 0.

When executed against the log of a run where the priority 1 test cases fail, we can see this result on the console:

If everything passes, nothing is printed for this example.

Please keep in mind that, with this method, any test case that has the word "Failed" in its name will also increase the count, so this is just for demonstration purposes.

Maybe we would prefer to also show the names of the failed test cases, or to print the last line of the file, which already contains a summary.
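Both are easy to approximate with findstr alone; for instance, this line prints every line of the log that mentions "Failed" (the failing test names plus the summary line, assuming the default console logger output):

findstr /i /c:"Failed" pathToFile\file.txt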

We could also create some code that sends us an email if there are any failed test cases, or pushes the results into a graph, or sends some sort of alert, or even a Slack message... There are many possibilities, but they are... well... another story.

Hacking social media

I know, I still owe you some stories, but I am now inspired to talk about something else. Besides, today's issue is easier to put into words: I don't need to sit down and think carefully about how to explain some technical concept, such as artificial intelligence, without sounding boring. But, just so you know, I am still working on the other stories.

I would like to show you how dangerous social media could become and, on one hand, highlight the need to ask the right questions when a new technology comes along, in order to set up proper tests and barriers (lynxes are curious animals, aren't we?).

On the other hand, I want to highlight the importance of taking breaks from it and thinking about yourself and the things that would make YOU happy, instead of the things that 'would make other people think that you are happy' and therefore approve of you. I hope you enjoy reading this article.

Let's think about it: the most viewed YouTube channels from independent creators belong to people under (or just around) their 30s, and many started them 5-9 years ago. They have been getting a lot of pressure from fans and from companies that would like a piece of their influence. Celebrities with less direct exposure to their fans have done crazy things in the past because of social pressure. Yet these influencers are not invited (that I know of) to Davos or to the famous lists of most influential people, despite demonstrating incredible marketing strategies and knowledge of new technologies, having charisma and being very intelligent (more than they let show, in some cases).

Social media affects society: not only these influencers, but many other people actually feel depressed or harm themselves because of it. It is also a potential channel for propaganda of all types and a source of advertisement of all sorts, using the platforms' 'algorithms' in their favour.

There is a very important point to consider, which is the potential for hacking (if you are interested in this, there is more information here and here). So, imagine that someone could actually go in there and decide what you are going to see... how could this affect you?

Techniques and prevention:

Let’s imagine a platform that accepts comments and likes (let’s forget about dislikes). How could someone socially hack it?

1) Removing the likes: we would need to intercept the information that the platform is showing to the user and remove the likes the user would see. Maybe we should only remove them partially, so the user does not get suspicious about not seeing any likes at all. For this, we could have a weighted random variable that decides whether to remove each like. How would you feel if, all of a sudden, nothing you write got any likes? Prevention: from the test side, make sure the like system works properly and likes cannot come from anonymous sources. Make sure accounts are real. Make sure the user sees only real data. From the user perspective, when you see something that you like, mention it in person; start a conversation about it instead of just clicking a button.

2) Liking specific posts: this is a bit fancier. Building on the above, we could have some sort of AI algorithm that classifies the posts. Then we can decide which ones will show as liked to the user. How would you feel if only some types of your posts got liked? Would that change your way of writing? Prevention: from the test side, make sure all information is shown to the user. From the user side, find your audience and focus on them. Also, consider talking with these people directly (or at conferences). Try to stay honest to your goals for writing and to who you want to reach.

3) Filtering comments: this would require some form of classification, as with the previous point. Instead of targeting the likes, we would target the comments, but the idea would be the same: eliminate from view those that are not 'interesting'. What would you think if you only received certain types of comments? Prevention: from the test side, make sure all information is shown to the user. Maybe have a conversation about the feature itself and allow users to hide all comments. From the user side, as above.

4) Creating comments: we could create new comments with AI. You might think the user would realise, but if done carefully they might not even notice, or never confront the person supposedly making that comment. Besides, the social media platform might allow comments from users who are not logged in. This adds to the effect of the previous one. Prevention: have a conversation about blocking anonymous comments or disabling them. From the user side, if you see a strange comment from someone that does not add up, clarify it with that person; it can also help with misunderstandings. Option 2: disable or stop reading comments.

5) Changing the advertisements around the website to ones convenient for propaganda or for harm (only for some types of social media). Prevention: most sites have a way of letting you decide which advertisements you are most interested in. Also, try cleaning cookies regularly, or use private browsing and VPN services.

6) Automatically extracting interesting information for malicious purposes. Prevention: be careful with what you post; don't use information that is available to anybody as your passwords or security questions, and don't post pictures with personal data (such as passports or train tickets). If you really want to share pictures of a trip, try uploading them after the trip, and enjoy it while you are there!

7) Connecting certain types of people. I am not 100% sure how this could be used for malicious purposes, but surely someone would find a way. Making sure users can block people is also very important.

8) Taking things out of context. Prevention: it's very hard to delete something from the internet once it is out there, but some platforms allow it. Have you ever read your old posts? It is a good idea to do some clean-up every so often. Also, if this happens to you, keep a record of the entire context. Maybe have a system in which you can review what you have written before it goes online; take some hours before posting to make sure you really want to post it.

Why am I talking about socially hacking social media on a test blog? Well, because, if you happen to be working on a social media project, you should make sure that these attacks are not possible, and think about how the user could feel about the features to come.

(Please take some time to go through the articles I linked above to learn more.)

Thoughts about being addicted to being connected:

When was the last time you did something good for someone but did not tell anybody about it? When you do something good for someone and post about it, how do you know you are doing it for the other person and not to build a better image of yourself in front of others?

When was the last time that you went for a trip and didn’t share the pictures with anybody? What about not taking any pictures at all? There is something truly special about having a memory that is just yours to have.

Conclusion

Social media has evolved quickly in a very short time, and we need to consider a lot of new things, more so if we are on the development team of one of these platforms. We should really stop and agree on what is ethical and what is not for each particular platform, and maybe even list a set of contraindications, as we do with addictive substances. For example, would it be ethical for some platforms (maybe those aimed at young people?) to change what users see in order to protect them from bad criticism? Consider that this could, in theory, save some lives; in contrast, it would take away some potentially good feedback disguised as bad comments. Maybe this is a feature you want people to be able to turn on and off? If not, maybe you should list in your contraindications that it could be an issue. And I don't mean terms and conditions; it's not about covering your behind if anything happens, it's about actually alerting the user to what they might experience. Terms and conditions are... well... another story.

Sources:

Some sources that helped me control my internet usage, in case you are interested in this area:

Book: “How to break up with your phone” by Catherine Price.

Watch: Crash courses on navigating digital information

Automating test case decision (using AI in testing part I)

1. The problem (and possible actions):

While testing, we need to decide carefully which test cases we will create, maintain, remove and execute per deployment.

Imagine that you join a company and get handed a long list of test cases. You know absolutely nothing about them and you need to decide which ones to use for production (you have a time restriction of 10 minutes to execute them). What would you do?

  1. Try to understand which of the existing tests are needed and decide manually which ones to run:
    1. Check the priority of these test cases. Unfortunately, not many people review the priority of test cases, so you can have obsolete test cases that are still marked as high priority but might be covered by other tests, or whose original functionality is no longer in place.
    2. Check the creation date. However, sometimes an old test case might still make sense or be important.
    3. Ask the existing testers. Although sometimes they have left the company by the time you join, and if not, things change so quickly that they might not be able to help anymore.
  2. Scrap it all and start over. I think this is a drastic solution; it might work out, but you might be wasting time redoing something that is already working fine.
    1. You could decide to just test the latest feature and not do any regression testing (trusting that the system was tested well enough before)
  3. Spend days learning about the features, executing all the test cases and figuring out what tests what and which tests you need to redo. It's a very analytical approach, but you are not likely to have the time for this, even if you have a lot of resources to execute them in parallel (which you should try to do). Also, maybe you need to refactor some of them, so you still need to make a selection.
    1. You could decide to leave comprehensive testing for after deployment and only focus on a small set of features before it.
    2. You could do the deployments at hours when the load is small and do them more often (although this is generally painful for the team)
  4. Use new technologies to figure out which test cases to run (for example AI).
  5. Mix and match: implementing point 4 on its own could be tricky. The best approach would be to mix it with the others: analysing and reviewing test cases, selecting the higher current priorities, executing them in parallel to verify the percentage of success, eliminating test cases that no longer make sense or that constantly fail...

As lynxes, we are curious animals and we tend to ask many questions to understand the system. For example, some of the questions you could ask are:

  • How fast are the iterations of the project? If there are fast iterations, chances are that old test cases are not needed anymore.
  • How long do we have to verify a build?
  • Are the development technologies changing? If they are, it would be a good moment to change the testing ones too, and point 4 could be a good solution here. I think it's always good to have similar technologies between development and testing, so both teams feel aligned and can help each other in a better way.
  • Are there testers available in the company whom you can ask about the recent features and tests? If so, you can start with 3, adding 1 and 2 on top (so you don't bother people with silly questions).
  • Is priority aligned within the company? Is priority per build or per feature? Is there a clear list of features per build? Is there a clear way of tracking which old features might be affected by the new ones?

It's important to balance the test cases well, so as to catch as many defects as possible, as early as possible, while ensuring there is no overhead in the process.

Some tests can produce false failures or be unreliable. I'd also like to highlight that sometimes writing tests takes too long or needs too many resources, and some testers will write those tests just for the sake of ticking the "automated" box. That is not a good practice; be careful with these.

2. Understanding the process (how do we test)

Every time we want to automate anything (in this case, we want to automate human decisions), we need to think about the manual way of doing it: when we, as humans, decide which test cases to execute, what do we base our decisions on? We want to check the priority (of test cases and features) and the creation date. We might also take into account the severity of the test and the feature (how costly it would be to fix a defect related to them). Another thing could be to look at previous runs and check how many times a test case has failed, or how many defects it has already raised.
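To make that last signal concrete, here is a hypothetical sketch of the kind of measure we could compute from previous runs (the type and its fields are made up for illustration):

// hypothetical record of a test case's history
public record TestHistory(string Name, int Runs, int Failures, int DefectsRaised);

public static class TestMetrics
{
    // failure rate over previous runs: one of the inputs we could later feed to the decision system
    public static double FailureRate(TestHistory h) =>
        h.Runs == 0 ? 0.0 : (double)h.Failures / h.Runs;
}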

Note that the measures themselves are also estimates, so it is important to have a good estimation process. The first thing to do is clean up the test cases and the system (process) itself. Having good documentation about when something is considered high priority or high severity can help align the system across the team or the company.

The second thing we need to do to automate test decisions is choose which variables we are going to take into account in our system. Some of those mentioned above could actually measure the same thing. Having a small and clear set of variables is essential to building a correct system, since the more variables there are, the more complicated the system becomes and the longer it takes to make decisions.

An example of two variables that could be measuring the same thing is the priority of the test case and the priority of the feature, if the system is well assigned.

There are tools and algorithms designed to automatically identify which variables are actually more important for the data, or what sort of relationships there are among them, as this is sometimes not obvious to a human. Just keep this in mind when creating your system (it is usually topic number one in any machine learning book).

3. What’s AI

In order to automate these decisions, we can make use of a technology that has been trending recently, thanks to new systems being able to compute faster and to the creation of better algorithms: artificial intelligence.

In 1959, Arthur Samuel famously described machine learning, a branch of artificial intelligence, as the field that gives "computers the ability to learn without being explicitly programmed."

Artificial intelligence is a big area, and there are many ways we could use it to help with testing.

Note also that this is not a simple topic, and many people have dedicated their entire careers to artificial intelligence. However, I am simplifying it as much as possible, since this is meant as an introduction and overview.

For this story, I am going to focus on using artificial intelligence to decide among test cases. I found two interesting ways of doing this. The first one is called a "rule-based system".

4. Rule based system:

A rule-based system is a way to store and manipulate knowledge so as to interpret information in a useful way. In our case, we would like to use fixed rules to get an automatic decision on whether to execute a test case or not. Imagine it as if you were teaching a newbie who needed your logic written down in notes.

For example: if risk is low and priority is low and the test case has run at least once before, then do not run the test case. This rule would not act on its own, but as part of a long list of rules written in this style (which is related to logic programming, in case you want to learn more about it). The group of rules is called the "knowledge base".
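Written as plain hard-coded logic (a hypothetical sketch, before we bring in any library), that example rule would look something like this:

public enum Level { Low, Medium, High }

public static class KnowledgeBase
{
    // IF risk IS low AND priority IS low AND the test has run before THEN do not run it
    public static bool ShouldExecute(Level risk, Level priority, int previousRuns)
    {
        if (risk == Level.Low && priority == Level.Low && previousRuns >= 1)
            return false;

        return true; // the rest of the knowledge base's rules would be evaluated here
    }
}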

In this system, there is no automatic inference of the rules (which means that they are given by a human; the machine does not guess them). But there are some cycles that the machine goes through in order to make a final decision:

  1. Match: the first phase tries to match all the possible rules against each test case, creating a conflict set with all the satisfied rules.
  2. Conflict-Resolution: One of the possible rules is chosen for execution for that test case. If no rules are satisfied, the interpreter halts.
  3. Act: we mark the test cases as execute or do-not-execute. We then execute and can return to 1, as the actions have changed the properties of the tests (last executed, passed or failed...).


5. Fuzzy logic – hands on:

If you ask experts to define things like 'high priority', 'new test case' or 'medium risk', they probably will not agree among themselves. They can agree that a test case is important, but exactly when to mark it with priority 3 or 2 or 1 (depending on your project's scale) would be a bit more difficult for them to explain.

In a fuzzy system, such as ours, we define things with percentages and probabilities. If we gather the information from the particular definitions for a variable, we will find it follows a specific function, such as a trapezoid, a triangle or a Gaussian.

Imagine that we asked a lot of experts and came up with the example below:

Let's define 'low' as a trapezoidal function starting at the edge (minimum value) and travelling to 20 and 40.

'Medium' would be the same kind of function on the points 20, 40, 60 and 80 (note that they overlap).

'High' starts at 60, reaches 80 and continues to the maximum value.

The graph for our system would look like this:

[Graph: the 'low', 'medium' and 'high' membership functions]

If we decide on the variables (for example, 'priority'), their definitions (also called labels; for example, 'low'), the functions that compose those labels (as in the graph above) and the rules among the variables, we should be able to implement a system that decides for us whether we should run a test or whether it is safe to go without it. Let's do so!

After a bit of digging for a good C# library to implement this sort of thing (maybe using F# would have been easier), I came across http://accord-framework.net, which seems to be a good library for many AI-related implementations. We can install its NuGet package from Visual Studio.

The first thing we need to do is define a fuzzy database to keep all these definitions:

// the fuzzy logic classes used below live in the Accord.Fuzzy namespace
using Accord.Fuzzy;

Database fdb = new Database();

Then we need to create linguistic variables representing the variables we want to use in our system. In our case, we want to look at priority, risk, novelty of the test case and pass/failure rate. Finally, we would like to define a linguistic variable to store the result, which we are calling 'mark execute'.

LinguisticVariable priority = new LinguisticVariable("Priority", 0, 100);
LinguisticVariable risk = new LinguisticVariable("Risk", 0, 100);
LinguisticVariable isNew = new LinguisticVariable("IsNew", 0, 100);
LinguisticVariable isPassing = new LinguisticVariable("IsPassing", 0, 100);
LinguisticVariable shouldExecute = new LinguisticVariable("MarkExecute", 0, 100);
// note on the last one that the C# variable name does not have to match the name used in the rules,
// which is the string literal that we are assigning to it

After that, we define the linguistic labels (fuzzy sets) that compose the above variables. For that, we need to define their functions.

For demonstration purposes, let's say that we have the same definitions of low, medium and high for both priority and risk. For novelty, pass rate and mark-execute, we are going to define a yes/no trapezoidal function. Note that we cannot use 'No', as it is a reserved word in the rule specifications (more below), so we call it 'DoNot' instead. The yes/no function graph that we are using looks like this:

[Graph: the 'Yes' and 'DoNot' membership functions]


// defining low - medium - high functions
TrapezoidalFunction function1 = new TrapezoidalFunction(20, 40, TrapezoidalFunction.EdgeType.Right);
FuzzySet low = new FuzzySet("Low", function1);
TrapezoidalFunction function2 = new TrapezoidalFunction(20, 40, 60, 80);
FuzzySet medium = new FuzzySet("Medium", function2);
TrapezoidalFunction function3 = new TrapezoidalFunction(60, 80, TrapezoidalFunction.EdgeType.Left);
FuzzySet high = new FuzzySet("High", function3);

// adding the labels to the variables priority and risk
priority.AddLabel(low);
priority.AddLabel(medium);
priority.AddLabel(high);
risk.AddLabel(low);
risk.AddLabel(medium);
risk.AddLabel(high);

// defining yes and no functions
TrapezoidalFunction function4 = new TrapezoidalFunction(10, 50, TrapezoidalFunction.EdgeType.Right);
FuzzySet no = new FuzzySet("DoNot", function4);
TrapezoidalFunction function5 = new TrapezoidalFunction(50, 90, TrapezoidalFunction.EdgeType.Left);
FuzzySet yes = new FuzzySet("Yes", function5);

// adding the labels to novelty (isNew), pass rate (isPassing) and markExecute (shouldExecute)

isNew.AddLabel(yes);
isNew.AddLabel(no);

isPassing.AddLabel(yes);
isPassing.AddLabel(no);

shouldExecute.AddLabel(yes);
shouldExecute.AddLabel(no);

// Lastly we add the variables with the labels already assigned to the fuzzy database defined above

fdb.AddVariable(priority);
fdb.AddVariable(risk);
fdb.AddVariable(isNew);
fdb.AddVariable(isPassing);
fdb.AddVariable(shouldExecute);

That was a bit long, still with me? We are almost done.

We have defined the system, but we still need to create the rules. The next step is creating the inference system and assigning some rules.

Note that in this implementation the rules are not weighted. We could make it a bit more specific (and complicated) by assigning weights to the rules to denote their importance.

Also, note that these rules are defined in plain English, making it easier for the experts and other players on the project to contribute to them.

InferenceSystem IS = new InferenceSystem(fdb, new CentroidDefuzzifier(1000));

// We are defining 6 rules as example, but we should take them from experts on the particular system. The rules don't necessarily need to work out for every system.
IS.NewRule("Rule 1", "IF Risk IS Low THEN MarkExecute IS DoNot");
IS.NewRule("Rule 2", "IF Priority IS High OR Risk IS High THEN MarkExecute IS Yes");
IS.NewRule("Rule 3", "IF Priority IS Medium AND IsPassing IS Yes THEN MarkExecute IS Yes");
IS.NewRule("Rule 4", "IF Risk IS Medium AND IsPassing IS DoNot THEN MarkExecute IS Yes");
IS.NewRule("Rule 5", "IF Priority IS Low AND IsPassing IS Yes THEN MarkExecute IS DoNot");
IS.NewRule("Rule 6", "IF IsNew IS Yes THEN MarkExecute IS Yes");

Finally, we need to set the actual inputs, that is, the values from the tests. The ideal scenario would be to retrieve them from a file; we could automate extracting the variables of our tests from our test case database into that file.

For this example, we are typing the values directly. Let's think of a test case with low priority (20 on the 0-100 scale), low risk, quite new (90% new) and with a low passing rate (which makes sense, since it is new). It would be defined like this:

IS.SetInput("Priority", 20);
IS.SetInput("Risk", 20);
IS.SetInput("IsNew", 90);
IS.SetInput("IsPassing", 10);

If we want to define a test case with high priority and risk, old and with high passing rate, the variables would look something like this:

IS.SetInput("Priority", 90);
IS.SetInput("Risk", 90);
IS.SetInput("IsNew", 10);
IS.SetInput("IsPassing", 90);

For now, let's get the output directly on the console. It would look like this:

try
{
    float newTC = IS.Evaluate("MarkExecute");
    Console.WriteLine(newTC);
    Console.ReadKey();
}
catch (Exception e)
{
    Console.WriteLine("Exception found: " + e.Message);
    Console.ReadKey();
}

The result of passing the first test case through this system is that we should execute it with 49.9% certainty; for the second one, we get 82.8%.

After playing around for a while with this particular set of rules, I'd say that the system is a bit pessimistic and plays it a bit too safe: it's hard to get values under 50% (below which we could assume it is safe not to execute the test case).
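If we wanted a hard yes/no decision out of it, we could simply threshold the defuzzified value; a minimal sketch, with 50% as an assumed cut-off:

float confidence = IS.Evaluate("MarkExecute");
// 50 is an assumed cut-off; tune it to how conservative you want the system to be
Console.WriteLine(confidence > 50 ? "Execute" : "Safe to skip");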

6. Rule based system – conclusions:

  • An expert (or several) is needed to specify all the rules (and we might bias the system; in the example above, I'm making it play too safe)
  • These rules won’t automatically change and adapt; we need to add new rules if the situation changes
  • The rules are hard to define: should we always run all the cases when risk is high and the feature is old?
  • Fuzzy definitions and fuzzy results make the system a bit complicated to understand and, again, to define
  • There could be relationships between the variables that are not obvious to us
  • We need to parse the test case variables so they make sense in the system (a bit more automation)

The problem with a human deciding the rules and the variables is that some of these variables could be measuring the same things, or relate to each other without it being obvious to us.

An example could be: when a feature is new and the risk is high, there might be a low probability of the test case failing, so we might not need to execute it. This could happen because, knowing that the risk is high, developers might put more effort into the code. (Note: this is hypothetical, not necessarily the case.)

That is why, while it is important to analyse as many variables as possible, we still need to reach a compromise and try not to fall into these traps, for which we need the experts... or a system that automatically discovers the importance of the variables. But that is... well... another story.


Virtual Reality start pack (and VR Udacity nanodegree experience)

If you are a regular reader of my blog, you are likely expecting a testing story. This is not exactly one, but have some patience, as I will link it to testing in upcoming posts.

Virtual reality is a field I was always curious about, ever since I was a little lynx, before it was even possible to bring it to users (as the devices were not quite portable back then).

On the other hand, I was looking to take a Udacity nanodegree to learn something new and keep myself updated. When you have been working as a developer for a while (especially as a developer in test), you need to keep up to date. I once heard feedback about a candidate interviewed by some friends: he did not really have 10 years of experience, he had 1 year of experience repeated in a loop of 10, and this can easily happen to anybody. To avoid it, what I do and what I recommend is: keep learning new stuff.

And so, I decided to take the VR nanodegree course on Udacity.

Advice if you are considering the VR nanodegree course:

The first thing you need to know about virtual reality is that you will need a device for it, otherwise you won’t be able to test anything you do.

The second thing you need to know about VR: if you want to work on multiple platforms, you also need multiple devices. This might seem obvious, but most current technologies (think about mobile, for example) have emulators that let you deploy and test on different devices. For VR, this is not there yet (at the time of writing and to my knowledge).

So, if you are planning on getting into the nanodegree course and into actual VR development, get ready to purchase an HTC Vive or an Oculus Rift, unless you are lucky enough to be able to borrow one, or unless you prefer to take the speciality about cinematics and 360 recording. I ended up picking that speciality. Not that I did not want to spend the money on a cool VR device that would also let me play cool games, but I had recently moved countries (and continents, in fact) and did not want to carry much stuff around with me.

One more thing to take into account: VR devices come with minimum computer specifications, so you might also need to upgrade your computer for the device to work properly.

Luckily for us, in VR we can also develop for mobile, which only requires a cheap headset into which to put your phone (you could even build your own). You can't do as many things as when you also have hand controllers and body movement detectors, but you can still do some cool things. For the first modules of the nanodegree this is all you need, and the course provides a cardboard to its students (which is great, because it has a click button and some other devices don't).

However, there is another thing you could get stuck on, although I think there are some workarounds (it would at least slow you down): you cannot directly develop for an iOS phone from a Windows device; you have to do it from a Mac.

In terms of content, I would advise you to be interested in multimedia and gaming if you decide to go through the course.

Feelings about the course itself

I actually really enjoyed the course (except for the speciality I was forced into by the lack of devices). I think the content is quite good, and the projects are challenging and open to creativity.

It’s also great to network with other people with interest in VR.

In terms of testing on VR, there is currently no module about it, but they do explain many things about performance and about what a good VR application should look like, so I believe this content is covered across the course.

Where should I start if I want to learn more about VR?

First of all, I think you can do this course for free if you don't mind not getting the degree (you cannot access the evaluations). That could be a good starting point, and you could always join for the assessments afterwards, which might save you a bit of time and money.

However, I'd say that the best way to get started is to actually try a device and some apps. On Android, you can download the 'Cardboard' app and Expeditions. Also, you can look for VR apps or games in the phone store (whichever your phone's OS). Another way could be checking Steam (with a more expensive device), YouTube, or even GitHub to see someone's code. For example, you can check out mine.

Last but not least, you can also install Unity, as it has an emulator that might give you an idea of how the world would look, and try to start playing around. There is plenty of documentation about it. Another good tool to check out is Unreal; you don't need as many development skills with that one.

What next?

So, you have checked out some VR apps and devices. You might even have created some small apps. The next step would be to be able to tell whether your apps (or someone else's) are of good quality (this is a test-focused blog, after all). For this, we should keep in mind some new considerations for every type of testing, but that's... well... another story.

Examples of AI applications and how we might test them

Recently I attended an online CrowdChat hosted by Ministry of Testing about testing AI applications.

The questions were very interesting, but it was hard to think of a right answer for all AI applications, as this is a very broad field. Explaining it over Twitter would be confusing, so I thought I might as well create a post giving some examples.

Kudos to the person on Twitter who mentioned supervised and unsupervised learning at the end of the chat. I was very sleepy at the time (the chat started at 4am my time), so I was not able to find their tweet in the morning to vote for it. I think we can better understand the types of AI applications we might face if we divide them into supervised vs unsupervised. More information here.

Supervised learning examples

The idea behind it is easy to understand: these applications have a learning phase in which we keep feeding them data, rewarding them when they produce a correct result and punishing them when they don't, until the produced results match the expected results within a threshold (in other words, until we are happy with the results).

Let’s ignore for now the exact ways in which we can punish or reward a machine and just focus on the general idea.

After this learning phase, we generally just use the application and no more learning takes place; we "turn off" the learning. This is called the inference phase. Not all applications have an inference phase; sometimes we want them to keep learning from the users, but this can turn out to be problematic, as we will see further on.

I think these are the easiest AI applications to test, functionally speaking, as we just need to pass in new data and check the results obtained against the expected ones. Apart from this, they behave just like any other application, and we can also go through the other types of testing without many changes (performance, security, system...).

NPR / OCR:

Imagine, for example, a number plate recognition system: once the system has learned how to recognise the numbers on a licence plate, you don't have to keep training it. The application can use the learned patterns to verify new number plates.

There are many tests we could think of here, without caring about how the application gets its results: try characters with strange typography (if allowed in the country), tilt the number plate, check the boundary on the distance from the vehicle...
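For instance, with MSTest (as in the first section) and a made-up PlateRecogniser API standing in for whatever the system under test exposes, one of those checks could be sketched like this:

using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class PlateRecognitionTests
{
    [TestMethod]
    public void RecognisesTiltedPlate()
    {
        // PlateRecogniser.Recognise is hypothetical: replace it with the real API under test
        string result = PlateRecogniser.Recognise("samples/tilted_plate.png");
        Assert.AreEqual("ABC123", result);
    }
}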

An OCR (optical character recognition) application could also be built with this technique. In fact, the number plate recognition system could be considered a specific type of OCR.

Digital personal assistance (Cortana, Siri, Alexa…):

Quite common nowadays, they help you find information using voice commands. They could also use supervised learning (although I believe the right classification for them would be "semi-supervised learning", but let's think of them as just supervised for the sake of the example). However, in this case the application keeps learning from the users; it stays in the learning phase.

The reason they can 'safely' do this is that they collect data from the users, but not the users' direct input on whether the result should be penalised or rewarded. An example of an application getting direct input from the user in order to keep learning would be a chatbot that guesses something and asks whether the guess was correct. This could easily be tricked by dishonest users.

Applications that keep learning are much trickier to test, even functionally, because if we pass them wrong inputs while testing, they will learn wrong. If I had to test one of these, I would use a copy of the state of each iteration we want to test, in an isolated environment, so we don't break the acquired good learning. For performance testing, it would be best to use valid data, to ensure the learning process continues well.

If anybody is concerned about AI gaining consciousness, this type of application would be the problematic one, as they could be learning things we are not aware of, depending on the power that the programmer and the user give them and the data they are able to collect. This brings up the question: should testers be responsible for testing consciousness?

Unsupervised learning examples

The key to these applications is discovering relationships in the data without direct penalisation or reward. They are very useful when we are not sure what the output should be, and for discovering things that we would not naturally think of as related.

There are two types: clustering (when the system discovers groupings in the data) and association (discovering rules that describe the data). I won't go deep into them in this post, as it is a lot of information as it is.

Tailored content-advertising (Amazon, Netflix, Google…)

These apps want to be able to predict what customers who bought something will be interested in next. In fact, digital personal assistants could also use this data to help you find what you want (that's why I mentioned before that they should be classified as 'semi-supervised' learning). I cannot think of any way of testing this other than checking the impact on sales after the application is in place, but this could potentially be subject to chance or other factors unrelated to the application itself.

Apart from that, testing the application should be the same as what we already do with non-AI applications (not just the results, but how the user inputs the data and how the application responds and displays the data...). Think of this as one feature of a bigger product; all the other features need to be tested as well.

The moral impact of these applications, in my opinion, is that at some point they might be telling you (as a user) what you want, even before you know you want it.

What could possibly go wrong?

What should we be careful about in AI that might not need as much attention in other apps?

Things could go very wrong if we leave apps learning constantly and let the users provide the penalties or rewards. You have probably heard of applications such as image recognition systems and chatbots becoming racist and sexist. Sometimes this is because the training data given to the application was biased, but it could also be because of trolls playing around with the application in unexpected ways and giving rewards when the application is wrong.

Also, leaving apps to learn on their own is not the best idea, as we do not control what they are actually learning, as mentioned before.

If you are interested, I found an article with some more examples of issues with AI applications here.

What else have you got?

Below is a list of readings that I found very interesting while researching this post (a couple of the links are about video games and AI):

How “hello neighbor” game’s AI works

AI predicting coding mistakes before developers make them

Examples of AI

Game examples of AI

How would you test these applications?

What do you think about the moral connotations?

If used well, AI could be harmless and powerful. In fact, it could also be a good tool that we could use for automating our testing, but that’s…well…another story.

My views on the future of testing

As you might know by now from my previous posts, I've always been between testing and developing.

Every company is different, and every person on the team is different. Some developers highly respect testing, write tests, or at least like to learn to think like testers. There are developers who use testers as a shield against management's disappointment when something goes wrong in production.

Some testers know how to code and are interested in new languages, technologies, systems... Others might know some programming, but just enough to stay in a comfortable job (this could apply to the devs too). Others are great testers even though they do not know how to program (and that should not be an issue or be taken lightly by the rest of the team).

I have noticed a lot of confusion and open interpretation around the testing side of development, and I have seen many movements on that side: some companies moving from QA to SDETs, and others moving from SDETs to software engineers. It seems like this is the new fashion, the process to follow: developers doing both dev and test tasks (and operations as well, if possible...).

I have been concerned, to be honest, that I might end up in pure development and the tester in me would go fully extinct. I felt as endangered as my lynx species.

The introduction of TDD, and then of BDD, makes companies ignore other types of testing (maybe with the exception of security) in favour of deploying earlier and faster, thus leaving the end-to-end 'testing' directly to the users.

However, I think this is not the right approach. Don't get me wrong: I was already developing and testing, and I would have 'no issues' moving with this current. But at the end of the day, if nobody is in charge of quality, why would anybody care about it?

If the previous testers are either gone or converted to developing first and thinking about quality second... at some point, after hiring more and more developers and promoting thinking about development speed first, who is going to think about quality? Who will have the knowledge to make a point or call out missing scenarios?

This might as well be a cyclic fashion, and at some point companies will need a test expert again: someone who cares about quality, creates integration and system testing, checks that the unit tests are right, verifies which tests are going to be run in the different builds, creates other types of testing, researches testing, architects solutions for automation... or, at minimum, someone who can train the developers to check for the right things and ensure the right testing happens.

How can you get test experts if there are no testers anymore? At the moment, the answer is: from other companies that still use testers... but what if there are no such companies anymore?

I've heard complaints about how hard it is to find testers who can also program or have development qualities, or developers who are OK with being called testers and undertaking tasks in the test system. Maybe this post helps someone understand the paradox.

So, if we can't find testers... is it ethical/safe to deploy without comprehensive end-to-end testing and let the users report the issues in order to speed up development? To me, it is not, although I can understand where this comes from, and that in some environments it is very hard and costly to find alternatives.

Will this be the new quality assurance culture? I don't think so, but that doesn't mean that things will go back to the way they were.

Should I be scared of losing my job? Well, that highly depends on your company, so I cannot answer that in an easy way.

What can I do to make sure I stand out? I'd say: try to make yourself valuable, just as with any job. And I do not mean by writing spaghetti code (although high respect to the Pastafarians, arr!); I mean knowledge-wise.

I think there is much more to come. Research to be done. New ways of automating, new processes, development, test expertise …

And that's why I think the future of testing is uncertain, but it will be different from what we have known until now. It has started to change, and I think we had better get on that boat and participate in the change rather than be driven by it.

Where to start? We should start by automating everything, including the automation. But that would be… well… a different story!