XML projects

Binary Valentine supports XML projects, which are fully self-contained and can specify all analysis options as well as the files and directories to scan. To run an analysis with an XML project, use the --config (or -c) command line option. Note that you cannot combine this option with any other options.

binary_valentine_cmd.exe -c my_project.xml

To learn more about the command line and the possible return codes, see the Command line page.

Project format

Let us take a look at an example project file and use it to learn about all possible options. Note that the XML file must be UTF-8 encoded.

<plan root_path="C:/my_project" thread_count="3" max_loaded_targets_size="2G" combined_analysis="true">
  <global_selector>
    <exclude_reports>
      <report>PE001</report>
      <report>PE002</report>
    </exclude_reports>
    <exclude_levels>
      <level>info</level>
    </exclude_levels>
    <exclude_categories>
      <category>format</category>
    </exclude_categories>
    <report_filter report="PE082" selection_mode="exclude" aggregation_mode="any">
      <regex arg="dll">kernel32.dll</regex>
      <regex arg="api">RegOpenKey[AW]</regex>
    </report_filter>
  </global_selector>

  <targets>
    <target recursive="true" path="executables">
      <rule_selector>
        <exclude_reports>
          <report>PE003</report>
        </exclude_reports>
      </rule_selector>
      <filter>
        <include>dll$</include>
        <exclude>api\-ms\-</exclude>
      </filter>
    </target>
    <target path="additional/binary.exe" />
  </targets>

  <reports>
    <terminal />
    <text>report.txt</text>
    <sarif>report.sarif.json</sarif>
  </reports>
</plan>

Plan

<plan> must be the root node of the XML file. All attributes of this element are optional:

  • root_path - the root path for all relative paths used later in the XML file. If absent, the current directory is used as the root path.
  • thread_count - Binary Valentine is a multithreaded application, and by default it allocates a number of threads equal to the number of processor cores. Each file is analyzed by a separate thread, and there is also a common thread which loads the files. Specify this attribute to override the default number of threads.
  • max_loaded_targets_size - by default, Binary Valentine reads up to 1 GB of data into memory. When a preloaded file has been analyzed and freed, Binary Valentine reads another file, keeping the total size of loaded files at no more than 1 GB. Specify this attribute to override the maximum amount of memory used for preloaded files waiting for analysis.
  • max_concurrent_tasks - instead of limiting the preloaded data by size, you can limit the number of simultaneously loaded files using this attribute. Note that you cannot use both max_loaded_targets_size and max_concurrent_tasks at the same time.
  • combined_analysis - by default, Binary Valentine performs both single-executable and cross-executable analysis. Cross-executable rules detect issues across all executables scanned by a project. For example, there are rules which detect PE version information inconsistencies across all analyzed files. Set this attribute to false or 0 to turn the combined analysis off.
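
As a minimal sketch of these attributes, the following hypothetical project limits preloading by file count rather than by size and turns the combined analysis off. The paths and the values 4 and 8 are illustrative assumptions, not recommendations.

```xml
<!-- Hypothetical project: the paths and numeric values are illustrative only. -->
<plan root_path="C:/my_project" thread_count="4" max_concurrent_tasks="8" combined_analysis="false">
  <!-- max_loaded_targets_size is deliberately absent: it cannot be combined with max_concurrent_tasks. -->
  <targets>
    <target path="executables" />
  </targets>
  <reports>
    <terminal />
  </reports>
</plan>
```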

Global selector

<global_selector> is an optional element which specifies the global filters for the output reports. Using this element, you can filter out particular rules, issue levels and categories, as well as perform rule-specific filtering by report messages. This element has no attributes. All supported nested elements are optional.

Reports selection

<exclude_reports> and <include_reports> are two optional elements of <global_selector>. To filter out some reports, use the <exclude_reports> element. Alternatively, to include only the specified reports, use the <include_reports> element. For example, if you only need to detect Portable Executable files with an absent relocation directory or disabled SAFESEH, you can use the following setup:

...
    <include_reports>
      <report>PE007</report>
      <report>PE009</report>
    </include_reports>
...

Here, PE007 is the "Relocations are absent" rule, and PE009 is the "SAFESEH is disabled" rule. You can list all rules by using the --list-reports command line option (see Command line). Both <exclude_reports> and <include_reports> can contain one or more child <report> elements. Note that you cannot use both <exclude_reports> and <include_reports> at the same time.

Issue levels and categories selection

To exclude some issue levels or categories from the output, use the <exclude_levels> and <exclude_categories> elements. By default, Binary Valentine reports all levels (info, warning, error, critical) and categories (system, optimization, security, configuration, format). The <exclude_levels> element can contain one or more nested <level> elements, and the <exclude_categories> element can contain one or more nested <category> elements (see the main example above).
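
For instance, a selector that leaves only error and critical issues, and drops the optimization and format categories, might look like the following sketch. The particular choice of levels and categories is illustrative.

```xml
<!-- Illustrative selector: which levels and categories to drop is up to the project. -->
<global_selector>
  <exclude_levels>
    <level>info</level>
    <level>warning</level>
  </exclude_levels>
  <exclude_categories>
    <category>optimization</category>
    <category>format</category>
  </exclude_categories>
</global_selector>
```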

Report filtering by report output

Binary Valentine XML projects support fine-grained report filtering by report output. Some rules output messages with one or more arguments (see the per-rule documentation to learn about each report output). This can be used to filter reports by the output the rule produces.

Suppose your goal is to detect usage of all deprecated WinAPI functions in the Portable Executable files of your project. However, there are still some old WinAPI functions which you cannot get rid of currently, and you would like to filter them out from the Binary Valentine output. This is easily achievable with the <report_filter> element (you can include several such elements inside <global_selector>):

...
    <report_filter report="PE082" selection_mode="exclude" aggregation_mode="any">
      <regex arg="dll">kernel32.dll</regex>
      <regex arg="api">RegOpenKey[AW]</regex>
    </report_filter>
...

In this example, the filter target is the PE082 rule, which is the "Deprecated WinAPI import" rule. The selection_mode attribute specifies the filter mode (either exclude reports with the matching output, or include only matching reports and exclude everything else). The aggregation_mode attribute specifies the filter matching process (any - the filter will match if any of the nested regular expressions match the report output; all - the filter will match if all of the nested regular expressions match the report output).

The nested <regex> elements specify case-sensitive ECMAScript regular expressions. In the example above, the filter will match if either of the two regular expressions matches the report output. The first one is matched against the dll report output argument, the second one against the api report output argument.

Overall, you can read this report_filter as follows: exclude all reports from the PE082 (Deprecated WinAPI import) rule if any of the following regular expressions match the report output:

  1. The DLL name matches kernel32.dll. This will filter out all deprecated WinAPI imports from kernel32.dll.
  2. The API name matches RegOpenKey[AW]. This will additionally filter out functions with names which match the RegOpenKey[AW] regular expression (e.g. RegOpenKeyA, RegOpenKeyW).

If the executable you are analyzing imports some deprecated WinAPI functions from kernel32.dll, and also RegOpenKeyA together with QueryTraceW from advapi32.dll, Binary Valentine will report only the use of QueryTraceW, because other deprecated functions will be filtered out by the <report_filter>.
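
The opposite mode can be sketched in the same way. The fragment below assumes that the selection_mode value for the include behavior is spelled include (only exclude appears in the example above), and the DLL name pattern is a hypothetical choice. With aggregation_mode set to all, a PE082 report would be kept only when both expressions match.

```xml
<!-- Assumption: "include" is the selection_mode value for include-only filtering. -->
<report_filter report="PE082" selection_mode="include" aggregation_mode="all">
  <!-- Both expressions must match for a report to be kept. -->
  <regex arg="dll">advapi32\.dll</regex>
  <regex arg="api">RegOpenKey[AW]</regex>
</report_filter>
```

Since the expressions are case-sensitive, the DLL name in the pattern has to use the same letter case as the name that appears in the report output.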

Targets to analyze

The required <targets> element is used to specify the targets to scan. This element must contain at least one nested <target> element.

Path

The <target> element must contain the path attribute, which should point to either a file or a directory with files to analyze. In the example above, the executables directory and the additional/binary.exe file will be analyzed. As the root path is set to C:/my_project, these paths will be translated to the absolute paths C:/my_project/executables/ and C:/my_project/additional/binary.exe.

The <target> element may contain an optional recursive attribute with a boolean value (true, false, 1 or 0). It specifies whether the directory given in the path attribute should be traversed recursively. This option is set to true by default.
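
As a short sketch, a target that scans only the top level of a directory could be written as follows. The directory name is taken from the main example.

```xml
<targets>
  <!-- recursive="false": subdirectories of "executables" are not traversed. -->
  <target recursive="false" path="executables" />
</targets>
```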

Target-specific rule selector

An optional <rule_selector> element can be included in <target>. If specified, this selector replaces the global one specified in the <global_selector> element and applies only to the current target. The format of the <rule_selector> element is identical to the <global_selector> format.
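
Because the target selector replaces the global one instead of being merged with it, global exclusions have to be repeated in <rule_selector> if they should still apply to that target. The following sketch illustrates the consequence, under the assumption that replacement works exactly as described above. The rule numbers and paths are reused from the main example.

```xml
<plan>
  <global_selector>
    <exclude_reports>
      <report>PE001</report>
    </exclude_reports>
  </global_selector>
  <targets>
    <!-- No own selector: the global one applies, so PE001 is excluded. -->
    <target path="additional/binary.exe" />
    <!-- Own selector replaces the global one: PE003 is excluded, PE001 is not. -->
    <target path="executables">
      <rule_selector>
        <exclude_reports>
          <report>PE003</report>
        </exclude_reports>
      </rule_selector>
    </target>
  </targets>
  <reports>
    <terminal />
  </reports>
</plan>
```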

Target directory file name filter

If a <target> points to a directory, you can add a nested <filter> element which filters the file names to analyze when traversing the directory. The <filter> element should contain one or more <include> and/or <exclude> nested elements. Both elements specify case-sensitive ECMAScript regular expressions, which will be matched against the absolute paths to the files in the directory. If an absolute file path matches all regular expressions specified in the <include> elements and does not match any regular expressions specified in the <exclude> elements, the file will be analyzed. Otherwise, it will be skipped.

In the example above, only the files with absolute paths ending with dll and not containing api-ms- will be analyzed.
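
The way several expressions combine can be sketched with a hypothetical filter. The names release and test below are assumptions about a directory layout, not part of Binary Valentine.

```xml
<target path="executables">
  <filter>
    <!-- Every <include> expression must match the absolute file path. -->
    <include>\.(exe|dll)$</include>
    <include>release</include>
    <!-- No <exclude> expression may match it. -->
    <exclude>test</exclude>
  </filter>
</target>
```

With this filter, a file is analyzed only if its absolute path ends with .exe or .dll, contains release, and does not contain test anywhere in the path.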

Output reports

The required <reports> element is used to specify the output reports options. Binary Valentine supports several output formats and can in addition output detected issues to the terminal. The <reports> element can contain one or more nested elements described below.

Terminal output

Add an empty <terminal /> element to <reports> if you want to enable issue reporting to the terminal.

Plaintext and SARIF reports

You can add as many reports in plaintext and SARIF formats as you want by using the <text> and <sarif> elements. Both elements specify the path to the report. In the example above, a report.txt plaintext report and a report.sarif.json SARIF report will be created in the C:/my_project directory after the analysis completes. Reports are always UTF-8 encoded.
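
As a sketch, a <reports> element that writes two plaintext copies and one SARIF report, without terminal output, could look like this. The file names are hypothetical.

```xml
<reports>
  <!-- Hypothetical file names; relative paths are resolved against root_path. -->
  <text>report.txt</text>
  <text>report_copy.txt</text>
  <sarif>report.sarif.json</sarif>
</reports>
```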

Return codes

Binary Valentine can return several different codes when it completes an analysis that uses an XML project. The meaning of these return codes is described on the Command line page.