minor markdown updates and fix single file access error
asteiker committed Nov 7, 2023
1 parent eca6258 commit f920836
Showing 1 changed file with 86 additions and 29 deletions.
115 changes: 86 additions & 29 deletions notebooks/NOAA_Access/Python_download_NOAA_NSIDC_data.ipynb
@@ -17,10 +17,10 @@
"metadata": {},
"source": [
"## 1. Tutorial Overview \n",
"This notebook demonstrates how to download NOAA@NSIDC data using python, it includes examples for downloading a single file and all the files in a directory.\n",
"This notebook demonstrates how to download NOAA@NSIDC data using python. It includes examples for downloading a single file and all the files in a directory.\n",
"\n",
"### Credits \n",
"This notebook was developed by Jennifer Roebuck of NSIDC\n",
"This notebook was developed by Jennifer Roebuck of NSIDC.\n",
"\n",
"For questions regarding the notebook or to report problems, please create a new issue in the [NSIDC-Data-Tutorials repo](https://github.com/nsidc/NSIDC-Data-Tutorials/issues)\n",
"\n",
@@ -51,13 +51,15 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#import the requests library \n",
"import requests\n",
"from bs4 import BeautifulSoup"
"from bs4 import BeautifulSoup #TBD Describe what this library does. Do we need to add it to our support set??"
]
},
{
@@ -67,7 +69,7 @@
"### Downloading a single file\n",
"This demonstrates how to download a single file.\n",
"\n",
"First we need to set the URL of the file we wish to download. The URL will follow the format of: https://noaadata.apps.nsidc.org/NOAA/ \\<path to data set and file\\>\n",
"First we need to set the URL of the file we wish to download. The URL will follow the format of: `https://noaadata.apps.nsidc.org/NOAA/<path to data set and file>`\n",
"\n",
"where \\<path to data set and file\\> is specific to the data set and can be determined by exploring https://noaadata.apps.nsidc.org in a web browser. \n",
"\n",
@@ -76,8 +78,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#URL of the file \n",
@@ -88,21 +92,24 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we need to create a HTTPS response object for that URL using the `get` method from the `requests` library."
"Next we need to create a HTTPS response object for that URL using the `get` method from the `requests` library. We will raise an exception if the response returns an error."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#Create a HTTPS response object\n",
"r = requests.get(file_url)\n",
"\n",
"#Catch any HTTPS errors\n",
"r.raise_for_status()\n",
"except requests.exceptions.HTTPError as err:\n",
" \n",
"try:\n",
" r = requests.get(file_url)\n",
" r.raise_for_status()\n",
"except requests.exceptions.RequestException as err:\n",
" raise SystemExit(err)"
]
},
@@ -115,8 +122,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#Download and save the file\n",
@@ -138,8 +147,10 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"#Set the URL of the directory we wish to download all the files from\n",
@@ -157,29 +168,75 @@
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"execution_count": 13,
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"<html>\n",
"<head><title>Index of /NOAA/G02135/north/daily/geotiff/1978/10_Oct/</title></head>\n",
"<body>\n",
"<h1>Index of /NOAA/G02135/north/daily/geotiff/1978/10_Oct/</h1><hr/><pre><a href=\"../\">../</a>\n",
"<a href=\"N_19781026_concentration_v3.0.tif\">N_19781026_concentration_v3.0.tif</a> 21-Jul-2017 03:51 666434\n",
"<a href=\"N_19781026_extent_v3.0.tif\">N_19781026_extent_v3.0.tif</a> 21-Jul-2017 12:25 138426\n",
"<a href=\"N_19781028_concentration_v3.0.tif\">N_19781028_concentration_v3.0.tif</a> 21-Jul-2017 03:51 666434\n",
"<a href=\"N_19781028_extent_v3.0.tif\">N_19781028_extent_v3.0.tif</a> 21-Jul-2017 12:25 138426\n",
"<a href=\"N_19781030_concentration_v3.0.tif\">N_19781030_concentration_v3.0.tif</a> 21-Jul-2017 03:51 666434\n",
"<a href=\"N_19781030_extent_v3.0.tif\">N_19781030_extent_v3.0.tif</a> 21-Jul-2017 12:25 138426\n",
"</pre><hr/></body>\n",
"</html>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Create a HTTPS response object\n",
"r = requests.get(archive_url)\n",
"\n",
"#Use BeautifulSoup to get a list of the filenames in the directory\n",
"data = BeautifulSoup(r.text, \"html.parser\")\n"
"#Use BeautifulSoup to get a list of the files in the directory\n",
"data = BeautifulSoup(r.text, \"html.parser\")\n",
"data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we will create a URL for each of the filenames, set filenames that we want to save the downloaded files as, and download the files. "
"Now we will create a URL for each of the files, set filenames for each of our downloaded files, and download the files. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"execution_count": 16,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"200\n",
"N_19781026_concentration_v3.0.tif\n",
"200\n",
"N_19781026_extent_v3.0.tif\n",
"200\n",
"N_19781028_concentration_v3.0.tif\n",
"200\n",
"N_19781028_extent_v3.0.tif\n",
"200\n",
"N_19781030_concentration_v3.0.tif\n",
"200\n",
"N_19781030_extent_v3.0.tif\n"
]
}
],
"source": [
"#Loop through the list of the html links (excluding the first one which is just a link to the previous directory)\n",
"for l in data.find_all(\"a\")[1:]:\n",
@@ -227,7 +284,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
"version": "3.10.12"
}
},
"nbformat": 4,
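
The directory-listing step in this diff parses the index page with BeautifulSoup and skips the first link, which points to the parent directory. As a rough, dependency-free sketch of the same idea, the snippet below collects the `href` values with Python's standard-library `html.parser` instead of `bs4`. This is a stand-in for illustration, not the notebook's method. The names `LinkCollector`, `list_file_urls`, `sample_index`, and `base` are hypothetical, and the directory URL is an assumption inferred from the index title shown in the cell output.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def list_file_urls(index_html, base_url):
    """Return absolute URLs for the files linked from a directory index page."""
    parser = LinkCollector()
    parser.feed(index_html)
    # Skip the parent-directory link ("../") and any subdirectories,
    # both of which end with a slash in this style of index page.
    return [urljoin(base_url, h) for h in parser.hrefs if not h.endswith("/")]


# Trimmed copy of the index page shown in the notebook's cell output.
sample_index = """<html><body><pre><a href="../">../</a>
<a href="N_19781026_concentration_v3.0.tif">N_19781026_concentration_v3.0.tif</a>
<a href="N_19781026_extent_v3.0.tif">N_19781026_extent_v3.0.tif</a>
</pre></body></html>"""

# Assumed directory URL, inferred from the index title in the diff.
base = "https://noaadata.apps.nsidc.org/NOAA/G02135/north/daily/geotiff/1978/10_Oct/"

for url in list_file_urls(sample_index, base):
    print(url)
```

Filtering on a trailing slash rather than dropping the first link by position is a design choice for this sketch: it does not depend on the parent-directory link always appearing first, but it also skips subdirectories, which may or may not be what a given data set needs.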