Tool - KeyFind

Variant-based keyword finder.

Overview

This Python script allows you to input a list of keywords, generate corresponding variants based on your specifications, and search for them within a file. It then returns each line that matches any keyword or its generated variant. Additionally, it provides a summary showing the frequency of each keyword/variant’s appearance throughout the file.

The results are neatly organized in a summary table, which can be easily exported into several formats, such as TXT, CSV, or JSON. You’ll also have access to various filtering and visualization functionalities to refine your analysis.

I built this tool after taking part in irisCTF2025, and in particular after working on the Windy Day forensics challenge. My writeup covers that process in more detail, and the source code for this script is available on my GitHub.

Usage

To run the script, use the following command:

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS]
  • [SOURCE] refers to the file/directory where the search will be performed.
  • [KEYWORDS] specifies the list of keywords to search for.
  • [VARIANTS] defines the types of variants you want to generate and search for.

By default, the script outputs a summary detailing how many times each keyword or variant appears and lists all the lines where a match is found.
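To make that default behaviour concrete, here is a minimal sketch of a search-and-count pass. It is not KeyFind's actual code: the function name `search_lines` is hypothetical, and it assumes that matching is a case-insensitive substring test and that the count is the number of matching lines per term.

```python
from collections import Counter


def search_lines(lines, terms):
    """Return (matches, counts) for a case-insensitive substring search.

    Assumption: one hit is counted per matching line and per term.
    """
    counts = Counter({term: 0 for term in terms})
    matches = []
    for line in lines:
        lowered = line.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term] += 1
                matches.append((term, line.rstrip("\n")))
    return matches, counts


# Hypothetical sample input, loosely modelled on the strings in the example further down.
sample = ["q=aXJpc2N0Z\n", "nothing here\n", "www.axjpc2n0z\n"]
matches, counts = search_lines(sample, ["irisctf", "aXJpc2N0"])
for term, count in counts.items():
    print(term, count)
```

Terms with no hit stay in the counter with a count of zero, which is what makes an option like --hide_zero meaningful.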

Currently, the script supports generating the following variants:

  • Hexadecimal,
  • Base64,
  • Caesar Cipher.

If there are any other variants you’d like to see supported, feel free to let me know! I add new variants whenever I run into them in CTF challenges.
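As an illustration of what a variant is, the sketch below derives the hexadecimal and Base64 forms of a keyword. The function names are hypothetical and this is not KeyFind's actual code. The Base64 part assumes that only the complete 3-byte groups of the keyword are encoded, because the characters produced from a trailing partial group depend on whatever bytes follow the keyword in the real data. That assumption is inferred from the example later in this post, where irisctf is searched as aXJpc2N0.

```python
import base64


def hex_variant(keyword):
    """Hexadecimal representation of the keyword's UTF-8 bytes."""
    return keyword.encode("utf-8").hex()


def base64_variant(keyword):
    """Base64 of the keyword, limited to its complete 3-byte groups.

    Assumption: the trailing 1 or 2 bytes are dropped so the result stays
    a stable prefix no matter what follows the keyword in the source data.
    Keywords shorter than 3 bytes therefore yield an empty string.
    """
    data = keyword.encode("utf-8")
    stable = data[: len(data) - len(data) % 3]
    return base64.b64encode(stable).decode("ascii")


print(hex_variant("irisctf"))
print(base64_variant("irisctf"))
```

This only covers data whose Base64 encoding starts exactly at the keyword; a keyword sitting at another offset inside encoded data would produce different characters.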

Options

Summary

| Option name      | Usage                   | Description                                                            |
| ---------------- | ----------------------- | ---------------------------------------------------------------------- |
| Keywords         | -k [KEYS]               | Define the list of keywords to search for.                             |
| Variants formats | -v [VARIANTS]           | Specify the types of variants to generate for each keyword.            |
| Render format    | -r [FORMAT]             | Specify the output format for rendering the results.                   |
| Extract file     | -e [OUTFILE]            | Output filename to save the extracted results.                         |
| Parameters       | -p [PARAMETERS]         | Select which parameters to extract or display.                         |
| Order            | -o [ORDER]              | Specify the parameter used to order the results.                       |
| Invert order     | --inv_order             | Invert the ordering of the results.                                    |
| Caesar shift     | --caesar_shift [SHIFTS] | Define the shifts to apply for generating Caesar cipher variants.      |
| Table format     | --tblfmt [FORMAT]       | Specify another table format for the output.                           |
| Minimum count    | --min [MIN]             | Set the minimum count threshold for a result to be included.           |
| Maximum count    | --max [MAX]             | Set the maximum count threshold for a result to be included.           |
| Remove duplicate | --remove_duplicate      | Remove duplicate results from the output.                              |
| Case sensitive   | --case_sensitive        | Enable case-sensitive keyword matching (not yet implemented).          |
| Index            | --index                 | Add an index when extracting results.                                  |
| Hide zero        | --hide_zero             | Hide results where no matches are found for a keyword or its variants. |
| No plain         | --no_plain              | Exclude the plain keywords defined by the user from the search.        |
| No space         | --no_space              | Trim leading and trailing spaces from the results before printing.     |
| No summary       | --no_summary            | Disable the display of the results summary.                            |
| Verbose          | --verbose               | Enable verbose mode for detailed output.                               |
| Quiet            | --quiet                 | Suppress output, except for critical information.                      |
| Very Quiet       | --qquiet                | Suppress all output, including banners and other messages.             |
| Time             | --time                  | Display the execution time of the script after completion.             |

Render

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] -r [RENDER]
  • Utility : Specifies the format in which the output will be rendered.
  • Choices : [RENDER] can be one of the following formats: txt, csv, json or tab.
  • Default : The default output format is csv.
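For readers who want to picture the CSV and JSON renderings, here is a rough sketch using only the standard library. The `render` function and the row layout (dictionaries with keyword, source and count keys) are assumptions made for illustration; KeyFind's real output may differ in details such as quoting, headers or the index column.

```python
import csv
import io
import json


def render(rows, fmt="csv"):
    """Serialise summary rows as CSV or JSON text (txt and tab are not sketched)."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=["keyword", "source", "count"])
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"unsupported format in this sketch: {fmt}")


# Row taken from the example summary later in this post.
rows = [{"keyword": "aXJpc2N0", "source": "b64 (irisctf)", "count": 1320}]
print(render(rows, "csv"))
print(render(rows, "json"))
```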

Order

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] -o [ORDER] --inv_order
  • Utility : Defines how the search results will be ordered based on specific parameters.
  • Choices : The [ORDER] parameter can be one of keyword, source, or count, which will sort the results in ascending order. Use --inv_order to invert the order and sort in descending order.
  • Default : By default there is no ordering.

Tip
Sorting by the count parameter is particularly useful when you want to see which keywords have the most matches. You can combine it with the --inv_order option to display results in descending order, making it easier to spot the most frequent matches.
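The ordering described in this section can be pictured with Python's built-in sorted. This is only a sketch under the assumption that each summary row is a dictionary; `order_rows` is a hypothetical helper, not a function from KeyFind.

```python
def order_rows(rows, key=None, invert=False):
    """Sort rows by the given column; keep the original order when key is None."""
    if key is None:
        return list(rows)
    return sorted(rows, key=lambda row: row[key], reverse=invert)


# Rows taken from the example summary later in this post.
rows = [
    {"keyword": "irisctf", "source": "user", "count": 0},
    {"keyword": "aXJpc2N0", "source": "b64 (irisctf)", "count": 1320},
]

# Equivalent in spirit to: -o count --inv_order
for row in order_rows(rows, key="count", invert=True):
    print(row["keyword"], row["count"])
```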

Parameters

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] -p [PARAMETERS]
  • Utility : Specifies which parameters to extract or display from the search results.
  • Choices : [PARAMETERS] can be one of the following: keyword, source or count.
  • Default : By default, all parameters are extracted and displayed.

Table format

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --tblfmt [TABLE FORMAT]
  • Utility : Changes the format in which the result table is rendered.
  • Choices : See the Tabulate documentation for available table formats.
  • Default : The default table format is github.

Minimum & Maximum

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --min [MIN]
KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --max [MAX]
  • Utility : Filters the results based on the count of matches.
  • Default : No filtering is applied by default.
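A count filter of this kind might look like the sketch below. The helper name `filter_by_count` is hypothetical, and treating both thresholds as inclusive is an assumption, since the options do not say whether the bounds themselves are kept.

```python
def filter_by_count(rows, minimum=None, maximum=None):
    """Keep rows whose count lies within the optional bounds (assumed inclusive)."""
    kept = []
    for row in rows:
        if minimum is not None and row["count"] < minimum:
            continue
        if maximum is not None and row["count"] > maximum:
            continue
        kept.append(row)
    return kept


# Rows taken from the example summary later in this post.
rows = [
    {"keyword": "irisctf", "source": "user", "count": 0},
    {"keyword": "aXJpc2N0", "source": "b64 (irisctf)", "count": 1320},
]
print(filter_by_count(rows, minimum=1))
```

With no bound given, every row passes, which matches the default of no filtering.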

Caesar shifts

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --caesar_shift [SHIFTS]
  • Utility : Specifies the shifts to be applied when generating Caesar cipher variants.
  • Default : If not specified, the script applies every possible shift of the 26-letter alphabet when generating Caesar cipher variants.
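For reference, a Caesar variant generator can be sketched as follows. The function names are hypothetical and this is not KeyFind's actual code. The default of shifts 1 to 25 is a choice made for this sketch: a shift of 0 or 26 simply reproduces the plain keyword.

```python
import string


def caesar(text, shift):
    """Shift ASCII letters by `shift` positions; other characters are left as-is."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


def caesar_variants(keyword, shifts=range(1, 26)):
    """Map each requested shift to the shifted keyword."""
    return {shift: caesar(keyword, shift) for shift in shifts}


variants = caesar_variants("irisctf", shifts=[1, 13])
print(variants)
```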

Remove duplicates

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --remove_duplicate
  • Utility : Removes duplicate results from the output. If enabled, only the first occurrence of each result will be kept, reducing the length of the output.
  • Default : This option is disabled by default. Enabling it may increase execution time due to additional processing.
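Keeping only the first occurrence of each result, in the original order, can be sketched in one line with dict.fromkeys, since dictionaries preserve insertion order. The helper name `remove_duplicates` is hypothetical, and comparing whole lines exactly is an assumption about what counts as a duplicate.

```python
def remove_duplicates(lines):
    """Drop repeated lines, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(lines))


# Hypothetical matched lines with repeats.
matched = ["q=aXJpc2N0Z", "www.axjpc2n0z", "q=aXJpc2N0Z"]
print(remove_duplicates(matched))
```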

Case sensitive

Not implemented (yet)

Hide zero

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --hide_zero
  • Utility : Hides keywords or variants that do not have any matches in the search. This helps produce cleaner and shorter results, highlighting only relevant data.
  • Default : By default, all keywords and variants, including those with zero matches, are displayed.

Other options

Add index

KeyFind.py [SOURCE] --index
  • Utility : Adds an index column when exporting results in CSV or JSON formats, helping you track the results more easily.
  • Default : The index is not included by default.

No plain

KeyFind.py [SOURCE] --no_plain
  • Utility : Excludes the plain user-defined keywords from the search, so only the generated variants are considered.
  • Default : By default, the plain user keywords are always included in the search.

No space

KeyFind.py [SOURCE] --no_space
  • Utility : Removes leading and trailing spaces from the output. This is especially useful when redirecting results to a file or processing them in scripts.
  • Default : By default, spaces are included in the output for better readability when displayed on the terminal.

No summary

KeyFind.py [SOURCE] --no_summary
  • Utility : Disables the printing of the summary results at the end of the execution.
  • Default : By default, the summary is printed after the search completes.

Verbose

KeyFind.py [SOURCE] -k [KEYWORDS] -v [VARIANTS] --verbose
  • Utility : Enables verbose mode, providing more detailed output during the script’s execution.
  • Default : Verbose mode is not enabled by default.

Quiet

KeyFind.py [SOURCE] --quiet
KeyFind.py [SOURCE] --qquiet
  • Utility :
    • --quiet: Suppresses output except for critical information.
    • --qquiet: Completely suppresses all output, including the banner and any messages.
  • Default : By default, output is printed and the banner is displayed.

Example

To illustrate how the script works, let’s use a case I encountered during the irisCTF2025 challenge, Windy Day. After extracting strings from a Firefox process dump, I needed to search for the flag, which I knew started with irisctf.

Using my script, I was able to search for the flag quickly with the following command:

python3 KeyFind.py iris.txt -k irisctf -v user b64 --quiet

Output :

|     | keyword  | source        | count |
| --- | -------- | ------------- | ----- |
| 0   | irisctf  | user          | 0     |
| 1   | aXJpc2N0 | b64 (irisctf) | 1320  |

Total : 1320

As we can see, the plain irisctf keyword has no match, but its Base64 variant aXJpc2N0 appears 1320 times. Now, let’s refine the results by removing duplicates and excluding keywords with zero matches.

python3 KeyFind.py ../test/keyword.txt -k irisctf -v user b64 --quiet --remove_duplicate --hide_zero

Output :

|     | keyword  | source        | count |
| --- | -------- | ------------- | ----- |
| 1   | aXJpc2N0 | b64 (irisctf) | 633   |

Total : 633

With duplicates removed and zero-match rows hidden, 633 occurrences of the Base64 variant remain. Next, we’ll extract the matching lines into a CSV file for further analysis using other tools. Since we only need the matching lines, we’ll only extract the result parameter:

python3 KeyFind.py ../test/keyword.txt -k irisctf -v user b64 -p result -r csv -e ../out/keyfind.csv --quiet --remove_duplicate --hide_zero

CSV Output :

#
aXJpc2N0ZntpX2FtX2FuX2lkaW90X3dpdGhfYmFkX21lbW9yeX0=
aXJpc2N0ZntpX2FtX2FuX2lkaW90X3dpdGhfYmFkX21l
client=firefox-b-d&q=aXJpc2N0Z
www.axjpc2n0z
...

This gives us the Base64-encoded strings that matched our search, which can now be analyzed further using other tools.
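As one possible next step, the extracted lines can be decoded with Python's standard base64 module. The sketch below is an illustration only, not the b64find script mentioned in the tip that follows. The regular expression, the 8-character minimum run length and the `decode_runs` name are all assumptions chosen for this example.

```python
import base64
import re

# Runs of at least 8 Base64 alphabet characters, optionally followed by padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{8,}={0,2}")


def decode_runs(line):
    """Decode every Base64-looking run in a line, tolerating truncated values."""
    decoded = []
    for run in B64_RUN.findall(line):
        core = run.rstrip("=")
        if len(core) % 4 == 1:
            # A single leftover character cannot encode a full byte: drop it.
            core = core[:-1]
        core += "=" * (-len(core) % 4)  # restore the padding Base64 expects
        try:
            decoded.append(base64.b64decode(core).decode("utf-8"))
        except ValueError:
            # Not valid Base64 or not valid UTF-8 text: skip this run.
            continue
    return decoded


# Lines copied from the CSV output above.
print(decode_runs("aXJpc2N0ZntpX2FtX2FuX2lkaW90X3dpdGhfYmFkX21lbW9yeX0="))
print(decode_runs("client=firefox-b-d&q=aXJpc2N0Z"))
```

Truncated values such as aXJpc2N0Z only decode up to their last complete group, so the result is a prefix of the original text.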

Tip
If you find that you’re working with Base64-encoded values, I’ve created another script called b64find. It simplifies the detection process and automates the decoding and extraction of Base64 values. You can find its source code on my GitHub.

Future Improvements

I plan to add more modules and variant capabilities to make this script an even more powerful keyword searching tool. Some features I’m considering include case sensitivity, searching within directories (not just individual files), and integrating Levenshtein distance algorithms to account for approximate string matching. Stay tuned for future updates!

Conclusion

This script allows you to quickly search for a keyword and its various possible variants. As I mentioned, this is a tool that I plan to continuously develop, adding new modules and functionalities as I encounter different use cases.

I hope you find it useful, and that the guide was clear enough to get you started. See you in the next one!

emree1

This post is licensed under CC BY 4.0 by the author.