Here is a crude method that works sometimes:

Code:
#!/bin/bash

if [[ $# -ne 1 ]]
then
    echo 1>&2 "Usage: $0 file.pdf
finds the pdf file's title, and outputs it to the standard output"
    exit 1
fi

pdftotext -layout "${1}" - | perl -n -e '
use warnings;
use strict;

our $processingTitle;
our @title;

my $l = $_;
$l =~ s/^\f//;    # remove form feeds

if ( $processingTitle ) {
    if ( $l =~ /^\s+$/ ) {
        last;
    }
    my ( $parts ) = $l =~ /^\s*(\S.+)/;
    $parts =~ s/\s+$//;
    push (@title, $parts);
}
elsif ( $l =~ /^\s*(\S.+)/ ) {
    $processingTitle = 1;
    my $parts = $1;
    $parts =~ s/\s+$//;
    push (@title, $parts);
}

END {
    if ( scalar(@title) == 0 ) {
        print "NO TITLE FOUND\n";
    }
    else {
        print join(" ", @title), "\n";
    }
}'

For example, assuming the environment has bash, pdftotext, and perl: https://assets.super.so/e46b77e7-ee...iles/2f2fc428-925c-4041-9f6d-bc387d904820.pdf

Code:
$ findpdftitle 2f2fc428-925c-4041-9f6d-bc387d904820.pdf
Commodity Option Implied Volatilities and the Expected Futures Returns
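The same "first non-blank block of text is probably the title" heuristic can also be sketched in awk, for systems without perl. This is my rewrite, not the script above; `extract_title` is a made-up helper name:

```shell
# Sketch: take the first run of non-blank lines from pdftotext output,
# join them with spaces, and treat that as the title.
# extract_title is a hypothetical name for illustration.
extract_title() {
    awk '
        { sub(/^\f/, "") }               # strip form feeds
        /[^[:space:]]/ {                 # a line with real text
            inblock = 1
            line = $0
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", line)   # trim
            if (title == "") { title = line } else { title = title " " line }
            next
        }
        inblock { exit }                 # first blank line after the block ends it
        END {
            if (title == "") print "NO TITLE FOUND"
            else print title
        }
    '
}
```

Used the same way as the perl version: pdftotext -layout file.pdf - | extract_title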
Until some new nick with 3 posts and 0 likes shows up and opens a new thread asking "so, I have a friend that needs coding help" lol
When saving such PDF downloads, I always save the web page where I found it as well. Web pages of course have a filename too, and it is usually a real title, not something cryptic like 194567.pdf. Later, when I look at the directory listing sorted by datetime, I can see which HTML document accompanied the PDF, and also open it in the browser, and so find out what the PDF is about (title etc.) and where I downloaded it from...