Get mutations via API #20

Open
krlmlr opened this issue Mar 19, 2021 · 9 comments

@krlmlr
Collaborator

krlmlr commented Mar 19, 2021

https://www.bfs.admin.ch/bfs/de/home/dienstleistungen/forschung/api/api-gemeinde.assetdetail.15224054.html

Also check how zazuko is doing it. Old way: zazuko/fso-lod@e8f08de, maybe this has improved by now?

@ThomasKnecht ThomasKnecht self-assigned this Mar 26, 2021
@ThomasKnecht
Contributor

ThomasKnecht commented Mar 26, 2021

@krlmlr -> see branch: https://github.com/cynkra/SwissCommunes/tree/f-20-download-with-api

We can download the data with the API, but some variables are missing, such as mAbolitionDate and mAdmissionDate.

Furthermore, the download takes a long time and returns everything as a single character string.

I added a method which is also used by swissdata. It uses the BFS_NR of the file to scrape the asset number, which allows us to download the same file structure as we had before.

I prefer this method, since we get the files in the same structure and all the variables are present.
Currently the code is just a script and not yet wrapped in a function.

What is your opinion about the download method?

@krlmlr
Collaborator Author

krlmlr commented Mar 26, 2021

Thanks. If we can reliably get the asset ID I'm fine with that approach. Basically, I'm fine with anything we can stuff into a GitHub Action ;-)

data-raw/update-data.R has most of the updating code, no need to duplicate it. Can you please add detection of the asset ID to that file in a separate PR? I'll take a closer look at this PR later; the API might be producing something that resembles the format we're computing with swc_get_mutations(). This means that we may be able to get rid of even more code.

@ThomasKnecht
Contributor

ThomasKnecht commented Mar 26, 2021

You are right, the API data has a similar structure to our data after swc_get_mutations(), but only the mutation date is given.
But maybe it is useful like that. We need to check whether the downstream functions can work with this format.

krlmlr added a commit that referenced this issue Mar 27, 2021
- `swc_read_data()` now always fetches the most recent dataset (#20).
@krlmlr
Collaborator Author

krlmlr commented Mar 27, 2021

A useful next step would be to write the result of swc_get_mutations() to a CSV file and to daff it with the results of the API (after column selection and renaming).
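
The comparison suggested here could look roughly like the sketch below. It is only an illustration under stated assumptions: both data frames are toy stand-ins, all column names and values are made up, and the real output of swc_get_mutations() and of the API will differ. The base R part finds rows that exist on only one side; the daff part runs only if the daff package is installed.

```r
# Toy stand-in for the output of swc_get_mutations().
# Column names and values are made up for illustration.
local_mut <- data.frame(
  id = c(1L, 2L, 3L),
  name = c("Town A", "Town B", "Town C"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the API result, with an extra column and different names.
api_mut <- data.frame(
  Identifier = c(1L, 2L, 4L),
  Label = c("Town A", "Town B (new)", "Town D"),
  Extra = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Column selection and renaming so both tables share one layout.
rename_map <- c(Identifier = "id", Label = "name")
api_sel <- api_mut[, names(rename_map)]
names(api_sel) <- unname(rename_map)

# Write both to CSV so they can also be diffed outside R.
local_csv <- tempfile(fileext = ".csv")
api_csv <- tempfile(fileext = ".csv")
write.csv(local_mut, local_csv, row.names = FALSE)
write.csv(api_sel, api_csv, row.names = FALSE)

# Base R comparison: build one key string per row, then look for
# rows that are present on only one side.
key <- function(df) do.call(paste, c(df, sep = "\r"))
only_local <- local_mut[!key(local_mut) %in% key(api_sel), ]
only_api <- api_sel[!key(api_sel) %in% key(local_mut), ]

# If daff is installed, it produces a cell-level diff instead.
if (requireNamespace("daff", quietly = TRUE)) {
  patch <- daff::diff_data(local_mut, api_sel)
}
```

With the toy data, `only_local` and `only_api` each hold the two rows that have no exact counterpart on the other side. A row-level check like this only says that something differs; the daff output is what shows which cells changed.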

@krlmlr
Collaborator Author

krlmlr commented Nov 30, 2021

We now have overwrite_data(), perhaps it does the job?

@TSchiefer
Member

overwrite_data() calls swc_read_data(), which ultimately fetches the current mutation data. It does not use the API, though; currently it downloads from e.g. https://www.bfs.admin.ch/bfsstatic/dam/assets/17884689/master (the asset number 17884689 is obtained by scraping a permanent link).
The mutations are then written to the CSV files.

swc_get_mutations() reads the CSV file(s).
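
For illustration, the asset-number lookup could be sketched in base R as below. This is not the code in swc_read_data(): the helpers `extract_asset_id()` and `asset_url()` are hypothetical, and the HTML fragment is a made-up stand-in for the scraped page. Only the URL pattern is taken from the link quoted in this comment.

```r
# Hypothetical helper: pull the first asset number out of scraped HTML.
# Assumes the page contains a link of the form "dam/assets/<digits>/master".
extract_asset_id <- function(html) {
  pattern <- "dam/assets/([0-9]+)/master"
  hit <- regmatches(html, regexpr(pattern, html))
  if (length(hit) == 0) {
    stop("No asset link found in the scraped page")
  }
  # Keep only the captured digits.
  sub(pattern, "\\1", hit)
}

# Hypothetical helper: rebuild the download URL from an asset number.
asset_url <- function(asset_id) {
  paste0("https://www.bfs.admin.ch/bfsstatic/dam/assets/", asset_id, "/master")
}

# Made-up HTML fragment standing in for the scraped permanent link.
html <- '<a href="/bfsstatic/dam/assets/17884689/master">Download</a>'

asset_id <- extract_asset_id(html)
url <- asset_url(asset_id)
```

The actual download (for example with `download.file(url, destfile)`) is left out so the sketch runs offline. Failing loudly when no link is found seems preferable for a scheduled job, since a changed page layout would otherwise go unnoticed.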

I don't think the current approach of overwrite_data() is so bad.
If we want to use the API though, I will have a look into https://github.com/cynkra/munch/tree/f-20-download-with-api to check once more if we can maybe use the API to be more flexible.

@salim-b

salim-b commented Mar 10, 2022

Hey there, I just randomly stumbled upon this repo. If you'd like to work with the API mentioned above: a while ago I wrote an R package (AGPL-3+) named swissmuni, which gives access to all endpoints and offers as much documentation as I could put together (I've updated the docs just now, since I only now learned about the PDF available here, which wasn't around two years ago).

As you already noticed, the API is very slow. Thus I've added a caching mechanism (building upon pkgpins). Be aware that the caching might not work as intended on Windows (cache lost after R session restart, I think); I still have to investigate that. Besides that, everything else should be stable.

I never got around to publishing the package on CRAN, but you can simply install it using remotes::install_gitlab("salim_b/r/pkgs/swissmuni").

@krlmlr
Collaborator Author

krlmlr commented Mar 25, 2022

Thanks! The API seems to deliver data that looks slightly different from what we are processing internally. It would be great to have a closer look to understand the differences and perhaps consolidate.

@krlmlr krlmlr modified the milestones: 0.2.0, 0.4.0 May 23, 2022