pdfgrep

pdfgrep is a command-line utility for searching text in PDF files.[1]

Documentation

Further information

Syntax

pdfgrep [PARAMETER ...] PATTERN [FILE ...]
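
The syntax above could be wrapped in a small shell function, shown here as a minimal sketch. The function name search_pdf and the file names in the usage note are hypothetical and only serve as illustration.

```shell
# Hypothetical helper: search one or more PDF files for a pattern.
# Usage: search_pdf PATTERN FILE...
search_pdf() {
  pattern=$1
  shift
  # Quoting keeps a pattern with spaces together as a single argument
  pdfgrep "$pattern" "$@"
}
```

A call such as search_pdf 'annual report' a.pdf b.pdf would then run pdfgrep with the pattern followed by both files.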

Parameters

Matching Control

-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern to search for. If this option is specified multiple times or combined with --file, all patterns are tried in turn until one of them matches.
-f FILE, --file=FILE
Read patterns from FILE, one per line. If FILE contains multiple patterns or if this option is applied multiple times or combined with --regexp, all patterns are tried in turn until one of them matches. An empty pattern list matches nothing.
-i, --ignore-case
Ignore case distinctions in both the PATTERN and the input files.
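
As a rough illustration of the matching options above, the following sketch combines -i with several -e patterns and reads patterns from a file with -f. The function names and the pattern file are hypothetical.

```shell
# Hypothetical helper: case-insensitive search for either of two patterns.
# Usage: search_either PATTERN1 PATTERN2 FILE...
search_either() {
  first=$1
  second=$2
  shift 2
  # Each -e adds one pattern; -i ignores case in the patterns and the PDF text
  pdfgrep -i -e "$first" -e "$second" "$@"
}

# Hypothetical helper: read the patterns, one per line, from a file.
# Usage: search_from_list PATTERN_FILE FILE...
search_from_list() {
  list=$1
  shift
  pdfgrep -f "$list" "$@"
}
```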

General Output Control

-c, --count
Suppress normal output. Instead print the number of matches for each input file.
NOTE:
Unlike grep, multiple matches on the same page will be counted individually.
-p, --page-count
Like --count, but prints the number of matches per page. Implies --page-number.
--color=WHEN
Surround file names, page numbers and matched text with escape sequences to display them in color on the terminal. WHEN can be:
always
Always use colors, even when stdout is not a terminal.
never
Do not use colors.
auto
Use colors only when stdout is a terminal (this is the default).
-l, --files-with-matches
Suppress normal output. Instead print the name of each input file that contains a match. This works well with --null, but many other output options like --page-number or --count are ignored when --files-with-matches is specified.
-L, --files-without-match
Suppress normal output. Instead print the name of each input file that doesn’t contain a match. This works well with --null, but many other output options like --page-number or --count are ignored when --files-without-match is specified.
-m NUMBER, --max-count=NUMBER
Stop reading a file after NUMBER matches. When the --count option is also used, pdfgrep does not output a count greater than NUMBER.
-o, --only-matching
Print only the matched part of a line without any surrounding context.
-q, --quiet
Suppress all normal output to stdout. Exit immediately with exit status 0 as soon as a match is found, even if an error occurred. Use this if you only care about the presence of matches, not their number or content.
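
The output options above lend themselves to scripting. The sketch below shows one possible way to use --count and --quiet. The function names are hypothetical, and pdf_contains relies on the exit status 0 described for --quiet.

```shell
# Hypothetical helper: print the number of matches in each file.
# Usage: count_matches PATTERN FILE...
count_matches() {
  pattern=$1
  shift
  pdfgrep -c "$pattern" "$@"
}

# Hypothetical helper: succeed if FILE contains PATTERN, printing nothing.
# Usage: pdf_contains PATTERN FILE
pdf_contains() {
  pdfgrep -q "$1" "$2"
}
```

In a script, pdf_contains can be used directly as a condition, for example: if pdf_contains 'invoice' scan.pdf; then echo "found"; fi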

Line Prefix Control

-Z, --null
Output a null byte (called NUL in ASCII and \0 in C) instead of the colon that usually separates a filename from the rest of the line. This option makes the output unambiguous in the presence of colons, spaces or newlines in the filename. It can be used in conjunction with commands such as xargs -0 or perl -0.
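
A minimal sketch of combining --null with xargs -0, as mentioned above: the hypothetical function copy_matching copies every PDF that contains the pattern into a target directory. It assumes that, together with --files-with-matches, each file name is followed by a null byte, and that xargs supports -0 and -I.

```shell
# Hypothetical helper: copy every matching PDF into a target directory.
# Usage: copy_matching PATTERN TARGET_DIR FILE...
copy_matching() {
  pattern=$1
  target=$2
  shift 2
  # -l prints only file names; -Z ends each name with a null byte, so
  # xargs -0 keeps names with spaces or newlines intact
  pdfgrep -l -Z "$pattern" "$@" | xargs -0 -I{} cp -- {} "$target"
}
```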

File Selection

-r, --recursive
Recursively search all files (restricted by --include and --exclude) under each directory, following symbolic links only if they are given on the command line.
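
A recursive search could be sketched as follows. The function name search_tree is hypothetical, and the --include=GLOB form is assumed to follow the pdfgrep manual page, since this page only mentions the option by name.

```shell
# Hypothetical helper: search the PDFs below a directory.
# Usage: search_tree PATTERN [DIRECTORY] [GLOB]
search_tree() {
  pattern=$1
  dir=${2:-.}          # default: current directory
  glob=${3:-*.pdf}     # default: only names ending in .pdf
  # -r descends into DIRECTORY; --include restricts which file names are searched
  pdfgrep -r --include="$glob" "$pattern" "$dir"
}
```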

References