
Enrich Quark tests #1

Open
haeter525 opened this issue May 26, 2021 · 6 comments

haeter525 commented May 26, 2021

Describing the issue
Quark has added many features over the last few years, but most of them are not well tested, and neither is the stage an APK reaches during analysis. The overall test coverage of Quark is 76%.

Why is this important?
As the table below shows, the analysis core of Quark is not fully covered by tests (avg. 80%):

  1. The Dalvik bytecode loader - Avg. 78.5% (pyeval.py, tableobject.py)
  2. The APK information supplier - Avg. 91% (apkinfo.py)
  3. The analysis implementation - Avg. 74% (quark.py)

Also, some of the other components ship without any tests (avg. 1%):

  1. Command Line Interface - cli.py, freshquark.py
  2. Call Graph / JSON Report - graph.py, output.py
  3. Public API - report.py

Third, the current test suite contains only one overall test (i.e., feeding in an iconic APK and checking that the stage it reaches is correct). All five stages of Quark should be tested with iconic APKs, or the analysis outcomes may not stay stable across versions.

The lack of tests has already led to a bad user experience for Quark (#111, #136, #145). It is also necessary groundwork for the later replacement of the Quark core library.

```
Name                                  Stmts   Miss  Cover
---------------------------------------------------------
quark\Evaluator\__init__.py               0      0   100%
quark\Evaluator\pyeval.py               143     28    80%
quark\Objects\__init__.py                 0      0   100%
quark\Objects\analysis.py                32      0   100%
quark\Objects\apkinfo.py                137     12    91%
quark\Objects\bytecodeobject.py          17      1    94%
quark\Objects\quark.py                  229     59    74%
quark\Objects\quarkrule.py               33      2    94%
quark\Objects\registerobject.py          33      1    97%
quark\Objects\tableobject.py             22      5    77%
quark\__init__.py                         1      0   100%
quark\cli.py                             76     76     0%
quark\config.py                           5      5     0%
quark\forensic\__init__.py                1      0   100%
quark\forensic\forensic.py               45      1    98%
quark\freshquark.py                      31     31     0%
quark\logo.py                             4      4     0%
quark\report.py                          25     25     0%
quark\utils\__init__.py                   0      0   100%
quark\utils\colors.py                    30     10    67%
quark\utils\graph.py                     61     58     5%
quark\utils\out.py                       17     10    41%
quark\utils\output.py                    30      9    70%
quark\utils\regex.py                     41      4    90%
quark\utils\tools.py                     14      3    79%
quark\utils\weight.py                    31      4    87%
setup.py                                  5      5     0%
...
---------------------------------------------------------
TOTAL                                  1484    357    76%
```

How are you going to do it?
My strategy is to make the tests cover every function in a module except those that:

  • contain only one statement (a thin wrapper), or
  • only log or print fixed strings, or
  • are private (the name starts with an underscore).
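
One way to exclude such functions from the report, for example, is coverage.py's default `# pragma: no cover` marker. A minimal sketch, assuming illustrative names (`LOGO`, `load_rules`, `get_rules`, and `_print_banner` are not actual Quark code):

```python
LOGO = "Quark"  # illustrative placeholder


def load_rules():
    return ["rule1", "rule2"]


def get_rules():  # pragma: no cover - one-statement wrapper
    return load_rules()


def _print_banner():  # pragma: no cover - only prints a fixed string
    print(LOGO)
```

Excluding the `def` line excludes the whole function from coverage measurement, so these trivial functions stop dragging the numbers down without being counted as tested.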

I will write tests in the following sequence, submitting one PR for each item:

  1. The analysis core of Quark (quark/Objects/*.py)
  2. The side components (quark/*.py)
  3. The overall tests for each stage

After that, I expect the coverage to increase to 90% to 95%.


krnick commented May 26, 2021

Hello @haeter525

Improving code coverage is quite important and essential for preventing the expected errors in the project. However, the coverage rate doesn't have to be 90% or higher. Also, coverage alone can't prevent unexpected errors.

The most important thing is that the code should be well tested, not that the coverage number merely rises. Therefore, you should consider what well-tested code means and develop your strategies to achieve it.

> Third, the current test suite contains only one overall test (i.e., feeding in an iconic APK and checking that the stage it reaches is correct). All five stages of Quark should be tested with iconic APKs, or the analysis outcomes may not stay stable across versions.

Regarding the claim that Quark has only one overall test: you should first figure out the difference between unit testing and integration testing.

Therefore, I would rather see what strategies you propose to achieve better tests than a plan that merely increases the numbers.


haeter525 commented May 27, 2021

Hi @krnick

> Improving code coverage is quite important and essential for preventing the expected errors in the project. However, the coverage rate doesn't have to be 90% or higher. Also, coverage alone can't prevent unexpected errors.

I agree with that. Raising code coverage alone can't test a program well, and it can't prevent unexpected errors.

But requirements coverage can help. It measures how many of a method's requirements are validated by tests. By raising it, you lower the probability of an unexpected error occurring.

There are two strategies to approach this:

  1. Boundary Value Analysis - include the boundary values of each input range as test data.
  2. Equivalence Class Partitioning - divide the input domain into partitions to reduce the number of tests.

These strategies have the following advantages.

  • Force the author to build comprehensive knowledge of the target method.
  • Provide enough confidence without redundant tests.
  • Find out the potential use cases.

Here are the steps.

  1. Clarify the requirements.
  2. For each input, divide it into partitions by its value range/domain.
  3. Find boundary values for those partitions containing ranges.
  4. Find a random value for the rest partitions.
  5. Merge those partitions that share the same test data and expected outcome.
  6. Write tests according to their test data.

For a quick example, please refer to the following comment.


Also, to write qualified tests, I am going to follow the guidelines from a well-known book, "The Art of Unit Testing".

The book evaluates a good test along three dimensions: readability, maintainability, and trust.
Here is a simplified version of its guidelines.

  • One test only tests one thing.
  • Tests are separated into three parts: arrange, act, and assert.
  • Tests should be repeatable and deterministic.
  • Variable names should be meaningful.

I will follow the above guideline and the strategies to deliver a qualified and usable test set.
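
As a quick illustration of the arrange/act/assert layout, here is a minimal sketch built around a hypothetical `add` function (not Quark code):

```python
def add(augend, addend):
    return augend + addend


def test_add_two_positive_integers_returns_their_sum():
    # Arrange: fixed inputs keep the test repeatable and deterministic.
    augend, addend = 2, 3

    # Act: exercise exactly one behavior.
    total = add(augend, addend)

    # Assert: one test verifies one thing.
    assert total == 5
```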

References
Boundary Value Analysis & Equivalence Class Partitioning.
Test Review Guidelines - The Art of Unit Testing


haeter525 commented May 28, 2021

Quick Example

Write tests for a method that determines whether a three-digit number is smaller than 500.

```python
def fun(three_digit_num):
    # Comparing a non-numeric type against an int raises TypeError by itself.
    if not 100 <= three_digit_num <= 999:
        raise ValueError('Not a number with 3 digits')
    return three_digit_num < 500
```

Step 1. Clarify the requirements. The input must be a three-digit number; the method returns True if it is smaller than 500, otherwise False. A non-numeric input raises a TypeError, and a number without exactly three digits raises a ValueError.

Step 2. Divide the input domain into partitions.

|  | Valid input | Invalid input |
| --- | --- | --- |
| Type | numeric types | non-numeric types |
| Number of digits | equal to 3 | greater than 3 / smaller than 3 |
| Number | smaller than 500 | greater than or equal to 500 |

| # | Partition | Test data | Expected outcome |
| --- | --- | --- | --- |
| 1 | Numeric types | - | True / False |
| 2 | Non-numeric types | - | TypeError |
| 3 | Number of digits == 3 | - | True / False |
| 4 | Number of digits > 3 | - | ValueError |
| 5 | Number of digits < 3 | - | ValueError |
| 6 | Number >= 500 | - | False |
| 7 | Number < 500 | - | True |

Step 3. Find the boundary values for #3, #4, #5, #6, and #7.

| # | Partition | Test data | Expected outcome |
| --- | --- | --- | --- |
| 1 | Numeric types | - | True / False |
| 2 | Non-numeric types | - | TypeError |
| 3 | Number of digits == 3 | 300 | True / False |
| 4 | Number of digits > 3 | 4000 | ValueError |
| 5 | Number of digits < 3 | 20 | ValueError |
| 6 | Number >= 500 | 500 | False |
| 7 | Number < 500 | 499 | True |

Step 4. Find a random value for #1 and #2.

| # | Partition | Test data | Expected outcome |
| --- | --- | --- | --- |
| 1 | Numeric types | 300 | True / False |
| 2 | Non-numeric types | None | TypeError |
| 3 | Number of digits == 3 | 300 | True / False |
| 4 | Number of digits > 3 | 4000 | ValueError |
| 5 | Number of digits < 3 | 20 | ValueError |
| 6 | Number >= 500 | 500 | False |
| 7 | Number < 500 | 499 | True |

Step 5. Merge partitions #1, #3, and #7; they share the same expected outcome and can use the same test data.

Step 6. Write tests for the remaining partitions.

| # of partition | 2 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- |
| Test data | None | 4000 | 20 | 500 | 499 |
| Expected outcome | TypeError | ValueError | ValueError | False | True |
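
Written as pytest tests, the remaining partitions might look like this sketch (assuming `fun` above is importable from the module under test; `example_module` is a hypothetical name):

```python
import pytest

from example_module import fun  # hypothetical module containing fun


def test_non_numeric_input_raises_type_error():
    with pytest.raises(TypeError):
        fun(None)  # Partition #2


def test_four_digit_number_raises_value_error():
    with pytest.raises(ValueError):
        fun(4000)  # Partition #4


def test_two_digit_number_raises_value_error():
    with pytest.raises(ValueError):
        fun(20)  # Partition #5


def test_number_at_500_boundary_returns_false():
    assert fun(500) is False  # Partition #6


def test_three_digit_number_below_500_returns_true():
    assert fun(499) is True  # Partition #7
```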


krnick commented May 31, 2021

Nice work! I think you have a good understanding of how to design a strategy to handle these tests.

The real situation will be more complicated than this test case, because the input data is an Android APK from an unknown source. That makes it more difficult to predict its behavior in our tests.

One quick question.

  1. In your case, if the input is a string or any other non-numeric type, the method returns False. Is there a better solution?


haeter525 commented Jun 2, 2021

Hi, @krnick

Yes. When inputs are unexpected, the better solution is to raise an exception immediately. This idea is called Fail Fast.

To be more specific, Python suggests two built-in exceptions for handling unanticipated inputs:

  • If inputs are unexpected types, a TypeError exception should be raised.
  • If inputs contain unexpected values, a ValueError exception should be raised.
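
For instance, here is a minimal sketch of this convention, using a hypothetical `set_threshold` function (not Quark code):

```python
def set_threshold(value):
    # Fail Fast: reject bad input right at the boundary instead of
    # letting it propagate and fail later in a confusing place.
    if not isinstance(value, (int, float)):
        raise TypeError(f"threshold must be numeric, got {type(value).__name__}")
    if not 0 <= value <= 100:
        raise ValueError(f"threshold must be between 0 and 100, got {value}")
    return value
```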

In my case, some partitions did not follow this suggestion. I have modified my example above to make it precise. Thank you!

  • Raise a TypeError instead of returning False for partition #2.
  • Raise a ValueError for partitions #4 and #5.
  • Adjust the example method accordingly.


krnick commented Jun 2, 2021

Exactly, that's the right way to do it!

I think you can get started with your coding. Please open an issue on the quark-engine repo to let me know which tests you would like to start first.
