Download all PDFs from a web page via PowerShell

Use PowerShell to download every PDF from a web page and save them to a nominated folder.

I recently needed to download a lot of PDF documents from a web page and thought that I’d get PowerShell to do the hard work for me.

The following PowerShell code opens a folder dialogue box, then downloads every PDF linked from the web page specified in the code into the folder you choose.

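function Grab-PDFs {
    # Load Windows Forms for the folder dialogue and message boxes
    Add-Type -AssemblyName System.Windows.Forms
    [System.Windows.Forms.Application]::EnableVisualStyles()

    $browse = New-Object System.Windows.Forms.FolderBrowserDialog
    $browse.SelectedPath = "C:\"
    $browse.ShowNewFolderButton = $false
    $browse.Description = "Select a directory"

    try {
        while ($true) {
            if ($browse.ShowDialog() -eq [System.Windows.Forms.DialogResult]::OK) {
                $folder = $browse.SelectedPath

                # Grab PDFs from web page
                $pageUrl = "http://www.example.com/path/to/pdfs"
                $psPage = Invoke-WebRequest -Uri $pageUrl -UseBasicParsing

                # .Links does not depend on the Internet Explorer DOM that ParsedHtml needs
                $urls = $psPage.Links | Where-Object { $_.href -like "*.pdf" } | Select-Object -ExpandProperty href

                foreach ($url in $urls) {
                    # Resolve relative links against the page URL
                    $pdfUri = [System.Uri]::new([System.Uri]$pageUrl, $url)
                    # Decode %20 and similar escapes so they do not end up in the file name
                    $fileName = [System.Uri]::UnescapeDataString($pdfUri.Segments[-1])
                    Invoke-WebRequest -Uri $pdfUri -OutFile (Join-Path $folder $fileName) -UseBasicParsing
                }

                Write-Host "... PDF downloading is complete."
                [System.Windows.Forms.MessageBox]::Show("Your PDFs have been downloaded.", "Job Complete") | Out-Null

                # Output the folder the PDFs were saved to
                return $folder
            }

            $res = [System.Windows.Forms.MessageBox]::Show("You clicked Cancel. Would you like to try again or exit?", "Select a location", [System.Windows.Forms.MessageBoxButtons]::RetryCancel)
            if ($res -eq [System.Windows.Forms.DialogResult]::Cancel) {
                # Ends script
                return
            }
        }
    }
    finally {
        # Release the dialogue whether the download finished or was cancelled
        $browse.Dispose()
    }
}

Grab-PDFs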

This code has also been uploaded to GitHub Gist.
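
The step most likely to trip the download is turning each link's href into something Invoke-WebRequest can fetch and a sensible file name to save it under. The following is a minimal sketch of that step on its own, so it can be checked without downloading anything. Resolve-PdfLink and its parameter names are hypothetical and not part of the script above. It assumes hrefs may be relative to the page and may contain escapes such as %20, and it does not guard against malformed hrefs.

```powershell
# Hypothetical helper (an assumption for illustration, not from the original script).
function Resolve-PdfLink {
    param(
        [string]   $PageUrl,  # page the links were scraped from
        [string[]] $Href      # raw href values found on that page
    )

    $base = [System.Uri]$PageUrl
    foreach ($h in $Href) {
        # Combine with the page URL; absolute hrefs pass through unchanged
        $uri = [System.Uri]::new($base, $h)

        # Match on the path only, so "file.pdf?download=1" still counts
        if ($uri.AbsolutePath -like '*.pdf') {
            [pscustomobject]@{
                Uri      = $uri
                # Last path segment with %20 and similar escapes decoded
                FileName = [System.Uri]::UnescapeDataString($uri.Segments[-1])
            }
        }
    }
}

# Example hrefs are made up for the demonstration
$links = Resolve-PdfLink -PageUrl 'http://www.example.com/path/to/pdfs' -Href @(
    'Annual%20Report.pdf',
    '/files/guide.pdf',
    'http://www.example.com/other/notes.PDF?download=1',
    'contact.html'
)
$links | Format-Table FileName, Uri
```

Each returned object could then be passed to Invoke-WebRequest, using Uri for -Uri and FileName (joined to the chosen folder) for -OutFile. Matching on the path rather than the whole href is a design choice here: it also picks up PDF links that carry a query string, which a plain "*.pdf" test on the href would miss.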

   

Comments

4 responses to “Download all PDFs from a web page via PowerShell”

On 28 May 2018, Anbu wrote:

How do I use the same script/logic to download a folder from a URL? Also, if the URL could be taken from the user, that would be far better.

On 12 March 2019, Jason wrote:

This is great, I was curious though if this can be used recursively to pull files from links further down.

thanks

On 13 March 2019, Jason wrote:

One other question: how do you get around the %20 issue in Invoke-WebRequest?

On 25 June 2020, jersam wrote:

I get the following.

Invoke-WebRequest : The URI prefix is not recognized.
At C:\Users\Administrator\Desktop\PDF_Grabber.ps1:25 char:27
+ … ach-Object {Invoke-WebRequest -Uri $_ -OutFile ($_ | Split-Path -Leaf …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotImplemented: (:) [Invoke-WebRequest], NotSupportedException
+ FullyQualifiedErrorId : WebCmdletIEDomNotSupportedException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
