diff --git a/README.md b/README.md
index 08ae697..99018b5 100644
--- a/README.md
+++ b/README.md
@@ -2,13 +2,47 @@
 - Tracked in our central issue repository: [Geo Issues](https://github.com/openstates/issues/labels/component%3Ageo)
 
+# Geo lookup endpoint (v3-district-geo)
+
+A small part of the code here, in the `/endpoint` directory, is a Lambda function that is deployed as an AWS Lambda
+called `v3-district-geo` (in the openstates account). This function handles geography-specific querying. At this point
+I don't remember why it is a separately deployed endpoint (maybe because API v2 was sharing it under the hood?).
+
+## Upgrading Python version/runtime + associated layer
+
+Occasionally the Python runtime needs to be upgraded. This also involves updating the [Lambda Layer](https://docs.aws.amazon.com/lambda/latest/dg/packaging-layers.html)
+that the function depends on to provide its psycopg2 dependency.
+
+To build and create a new layer:
+
+* Use the `aws-psycopg2` package to obtain a version of psycopg2 compiled for the AWS environment
+* Change directory to the `endpoint` directory
+* Create a folder called `python`
+* Install the dependency to the folder: `pip install --target ./python aws-psycopg2`
+* Package up the folder as a zip: `zip -r python39awspsycopg2.zip python`
+* Use the [AWS console to upload the new layer](https://us-east-1.console.aws.amazon.com/lambda/home?region=us-east-1#/layers)
+  (or add it as a new version to an existing layer)
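+
+Putting those steps together, a minimal sketch (the zip name is just a convention; the final `aws` CLI call is an
+optional alternative to the console upload and assumes the AWS CLI is configured for the openstates account):
+
+```bash
+cd endpoint
+mkdir -p python
+# aws-psycopg2 ships psycopg2 binaries built for the Lambda environment
+pip install --target ./python aws-psycopg2
+zip -r python39awspsycopg2.zip python
+# optional: publish the layer from the CLI instead of the console
+aws lambda publish-layer-version \
+    --layer-name python39awspsycopg2 \
+    --zip-file fileb://python39awspsycopg2.zip \
+    --compatible-runtimes python3.9
+```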
+
+To upgrade the Python version/runtime of the function:
+
+* Click the "Layers" icon in the [AWS Lambda console UI](https://us-east-1.console.aws.amazon.com/lambda/home?region=us-east-1#/functions/v3-district-geo?tab=code)
+* Click the "Edit" button
+* Choose the existing layer (associated with the previous Python runtime/version) and delete it
+* Go back to the `v3-district-geo` function, scroll down to "Runtime Settings" and click "Edit"
+* Change the Python version
+* Now go back to the Layers section and use "Add a layer" to associate the layer that is compatible with that Python
+  version
+
 # Open States Geography Processing & Server
 
-Generate and upload map tiles for the state-level legislative district maps on [openstates.org](https://openstates.org/), both for [state overviews](https://openstates.org/ca/) and for [individual legislators](https://openstates.org/person/tim-ashe-4mV4UFZqI2WsxsnYXLM8Vb/).
+Generate and upload map tiles for the state-level legislative district maps
+on [openstates.org](https://openstates.org/), both for [state overviews](https://openstates.org/ca/) and
+for [individual legislators](https://openstates.org/person/tim-ashe-4mV4UFZqI2WsxsnYXLM8Vb/).
 
-- Source: SLDL and SLDU shapefiles from [the Census's TIGER/Line database](https://www.census.gov/geo/maps-data/data/tiger-line.html)
+- Source: SLDL and SLDU shapefiles
+  from [the Census's TIGER/Line database](https://www.census.gov/geo/maps-data/data/tiger-line.html)
 - Output: a single nationwide MBTiles vector tile set, uploaded to Mapbox for hosting
-  - Intermediate files are also built and retained locally, stored in the `data` directory for debugging
+    - Intermediate files are also built and retained locally, stored in the `data` directory for debugging
 
 ![](tileset-screenshot.png)
 
@@ -22,15 +56,20 @@ Generate and upload map tiles for the state-level legislative district maps on [
 
 We download our shapefiles from [census.gov](https://www2.census.gov/geo/tiger).
 
-The organization of files within TIGER's site means that we may have to change the layout of downloaded files from year to year (in `utils/tiger.py`). As long as we consistently add proper files into `data/source_cache` for the rest of the scripts to process, changing the initial download location shouldn't matter.
+The organization of files within TIGER's site means that we may have to change the layout of downloaded files from year
+to year (in `utils/tiger.py`). As long as we consistently add the proper files into `data/source_cache` for the rest of
+the scripts to process, changing the initial download location shouldn't matter.
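+
+For example, fetching a single state's upper-chamber shapefile by hand might look like the following (a sketch only:
+the directory layout and file naming here assume the 2022 vintage and can differ in other years):
+
+```bash
+# Alabama (FIPS code 01), upper chamber (sldu), 2022 vintage
+curl -sO https://www2.census.gov/geo/TIGER2022/SLDU/tl_2022_01_sldu.zip
+# the rest of the scripts only care that the proper files end up in data/source_cache
+mkdir -p data/source_cache
+unzip -o tl_2022_01_sldu.zip -d data/source_cache
+```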
 
 See Appendix A below on Geographic Data Sources for more context.
 
-You'll probably want to remove any cached files in `./data/`. The download tool may try to re-use cached files from the wrong year if they still exist. (We don't manually remove these files because you may need to re-run the scripts, and skipping downloads is useful)
+You'll probably want to remove any cached files in `./data/`. The download tool may try to re-use cached files from the
+wrong year if they still exist. (We don't remove these files automatically because you may need to re-run the scripts,
+and skipping downloads is useful.)
 
 ### National Boundary Update
 
-`config/settings.yml` holds the `BOUNDARY_YEAR` config. This setting defines what to apply to our US boundary template link:
+`config/settings.yml` holds the `BOUNDARY_YEAR` config. This setting supplies the year used in our US boundary download
+URL template:
 
 ```python
 f"{TIGER_ROOT}/GENZ{boundary_year}/shp/cb_{boundary_year}_us_nation_5m.zip"
 ```
 
@@ -40,16 +79,18 @@ We should verify/update this setting to the most recently available boundary yea
 
 ### Note on file naming
 
-You'll see many files with names like `sldu`, `sldl` or `cd` during this process. Here is a quick layout of what those file name abbreviations mean:
+You'll see many files with names like `sldu`, `sldl` or `cd` during this process. Here is what those file name
+abbreviations mean:
 
 - `sldu`
-  - State Level District Upper -> Upper Chamber District boundaries
+    - State Legislative District Upper -> Upper Chamber District boundaries
 - `sldl`
-  - State Level District Lower -> Lower Chamber District boundaries
+    - State Legislative District Lower -> Lower Chamber District boundaries
 - `cd`
-  - Congressional District -> Federal Congressional District boundaries
+    - Congressional District -> Federal Congressional District boundaries
 
-We do not collect boundaries for Federal Senate because each state has the same number of senators and they are considered "at-large" (having no district boundaries beyond the entire state).
+We do not collect boundaries for the federal Senate because each state has the same number of senators and they are
+considered "at-large" (having no district boundaries beyond the entire state).
 
 ## Running
 
@@ -57,22 +98,25 @@ There are several steps, which typically need to be run in order:
 
 1) Setup Poetry:
 
-   - `poetry install`
+- `poetry install`
 
 2) Make sure environment variables are set correctly:
 
-   - `DATABASE_URL`: pointing at either the `geo` database in production or to a local copy, e.g. `DATABASE_URL=postgis://:@/geo`
-   - `MAPBOX_ACCESS_TOKEN`: a API token for Mapbox with permissions to upload tilesets
-   - `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`: AWS credentials to upload bulk versions of geo data
+- `DATABASE_URL`: pointing at either the `geo` database in production or to a local copy,
+  e.g. `DATABASE_URL=postgis://:@/geo`
+- `MAPBOX_ACCESS_TOKEN`: an API token for Mapbox with permissions to upload tilesets
+- `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`: AWS credentials to upload bulk versions of geo data
 
 3) Download and format geo data:
 
-   - `poetry run python generate-geo-data.py --run-migrations --upload-data`
-   - Note that this script does not fail on individual download failures. If you see failures in the run, make sure they are expected (e.g. NE/DC lower should fail)
+- `poetry run python generate-geo-data.py --run-migrations --upload-data`
+    - Note that this script does not fail on individual download failures. If you see failures in the run, make sure
+      they are expected (e.g. NE/DC lower should fail). See the combined sketch below.
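+
+Putting steps 1-3 together (a sketch; the variable values are placeholders to replace with real credentials):
+
+```bash
+poetry install
+export DATABASE_URL="postgis://:@/geo"          # or a production connection string
+export MAPBOX_ACCESS_TOKEN="your-mapbox-token"  # placeholder
+export AWS_ACCESS_KEY_ID="..."                  # placeholder
+export AWS_SECRET_ACCESS_KEY="..."              # placeholder
+poetry run python generate-geo-data.py --run-migrations --upload-data
+```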
 
 ### Setting up environment variables
 
-There are plenty of ways to set environment variables, but quick way to manage many environment variables is with an "environment file". e.g.
+There are plenty of ways to set environment variables, but a quick way to manage many of them is with an
+"environment file", e.g.
 
 ```bash
 AWS_ACCESS_KEY_ID="user"
@@ -93,9 +137,11 @@ After that, we can easily load the file:
 
 ### Running within Docker
 
-Instead of setting up your local environment you can instead run using Docker. Using Docker Compose will still allow you to access all intermediate files from the processing, within your local `data` directory.
+Instead of setting up your local environment, you can run using Docker. Using Docker Compose will still allow you to
+access all intermediate files from the processing, within your local `data` directory.
 
-Build and run with Docker Compose. Similar to running without Docker, environment variables must be set in your local environment.
+Build and run with Docker Compose. As when running without Docker, environment variables must be set in your local
+environment.
 
 ```
 docker-compose up make-tiles
@@ -105,17 +151,21 @@
 
 openstates-geo works with shapefiles. Shapefiles can be opened by a tool called [qgis](https://www.qgis.org/en/site/).
 For example, to inspect a source shapefile, such as `tl_2022_01_sldl.shp`, open up qgis and navigate to the folder where
-that file resides. Open the file, it should appear in the main pane as a map. Use the "Select Features by Area or single click"
+that file resides. Open the file; it should appear in the main pane as a map. Use the "Select Features by Area or single
+click"
 button in the toolbar, and then select a district. Metadata should appear in the right pane.
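+
+If you just want a quick, non-GUI look at a shapefile, GDAL's `ogrinfo` can dump the same layer metadata from the
+command line (assuming you have GDAL installed; it is not otherwise required by this repo):
+
+```bash
+# -al = report on all layers, -so = summary only (fields, feature count, extent)
+ogrinfo -al -so tl_2022_01_sldl.shp
+```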
"PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"}, + {file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"}, + {file = "PyYAML-6.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:0d3304d8c0adc42be59c5f8a4d9e3d7379e6955ad754aa9d6ab7a398b59dd1df"}, {file = "PyYAML-6.0.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:50550eb667afee136e9a77d6dc71ae76a44df8b3e51e41b77f6de2932bfe0f47"}, {file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1fe35611261b29bd1de0070f0b2f47cb6ff71fa6595c077e42bd0c419fa27b98"}, {file = "PyYAML-6.0.1-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:704219a11b772aea0d8ecd7058d0082713c3562b4e271b849ad7dc4a5c90c13c"}, @@ -1573,6 +1581,7 @@ files = [ {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a0cd17c15d3bb3fa06978b4e8958dcdc6e0174ccea823003a106c7d4d7899ac5"}, {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:28c119d996beec18c05208a8bd78cbe4007878c6dd15091efb73a30e90539696"}, {file = "PyYAML-6.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7e07cbde391ba96ab58e532ff4803f79c4129397514e1413a7dc761ccd755735"}, + {file = "PyYAML-6.0.1-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:49a183be227561de579b4a36efbb21b3eab9651dd81b1858589f796549873dd6"}, {file = "PyYAML-6.0.1-cp38-cp38-win32.whl", hash = "sha256:184c5108a2aca3c5b3d3bf9395d50893a7ab82a38004c8f61c258d4428e80206"}, {file = "PyYAML-6.0.1-cp38-cp38-win_amd64.whl", hash = "sha256:1e2722cc9fbb45d9b87631ac70924c11d3a401b2d7f410cc0e3bbf249f2dca62"}, {file = "PyYAML-6.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:9eb6caa9a297fc2c2fb8862bc5370d0303ddba53ba97e71f08023b6cd73d16a8"}, @@ -1580,6 +1589,7 @@ files = [ {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5773183b6446b2c99bb77e77595dd486303b4faab2b086e7b17bc6bef28865f6"}, {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:b786eecbdf8499b9ca1d697215862083bd6d2a99965554781d0d8d1ad31e13a0"}, {file = "PyYAML-6.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc1bf2925a1ecd43da378f4db9e4f799775d6367bdb94671027b73b393a7c42c"}, + {file = "PyYAML-6.0.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:04ac92ad1925b2cff1db0cfebffb6ffc43457495c9b3c39d3fcae417d7125dc5"}, {file = "PyYAML-6.0.1-cp39-cp39-win32.whl", hash = "sha256:faca3bdcf85b2fc05d06ff3fbc1f83e1391b3e724afa3feba7d13eeab355484c"}, {file = "PyYAML-6.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:510c9deebc5c0225e8c96813043e62b680ba2f9c50a08d3724c7f28a747d1486"}, {file = "PyYAML-6.0.1.tar.gz", hash = "sha256:bfdf460b1736c775f2ba9f6a92bca30bc2095067b8a9d77876d1fad6cc3b4a43"},