-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: out-of-tree Pyodide builds in CI for pandas
#57891
Comments
I have experimented with this in a branch on my fork, where I have managed to bring the test suite down to just 31 failures by skipping certain tests and fixing a few other ones – I can open a draft PR for this for the purposes of visibility as of now, and work towards getting down to zero failures with the help and support of the maintainers and core developers. |
xref some other discussions about Pyodide as follows (some of them coming from the PyArrow requirement that revolves around PDEP-10):
The issue at hand with PyArrow as a required dependency is the increased size of the download, which brings negative implications for using |
I am not super optimistic about having to debug pyodide stuff in my PRs, I am really hesitant on adding official support for the platform |
Hi, @phofl – I totally understand that. It does come with its concerns related to maintenance; if patches might be required to disable certain functionality and debugging them is indeed painful. I would be happy to take on that sort of maintenance if needed, but realistically, yes, it won't be possible to do so for me or anyone else every time there is a failure or if any critical changes are required in a jiffy. I will keep my PR marked as draft for now, but I hope that the team changes their mind :) That being said, if it is ever enabled it will still be regarded as an "experimental" platform though, and it will stay that way for a long while until things in the Pyodide ecosystem become stable and we have a v1 release. But I suppose it also depends upon the popularity of the platform continues to evolve in the coming years. Here's what the scikit-learn developers think about that if you wish to have a read: scikit-learn/scikit-learn#23727 |
How far is Numpy's progress on this front? I would also be hesitant with officially supporting this too until pandas' required dependencies have WASM support (at least CI testing) |
numpy has a workflow testing this already |
Thank you, @lithomas1. Yes, @mroeschke, NumPy is currently testing against Pyodide for all PRs with the link to the workflow as shared above, restarting with this PR: numpy/numpy#25894 – thanks to @rgommers for his help. It did require a bit of back and forth with the migration to the new build system, but that's not really required with |
A bit of context:
Since @agriyakhetarpal is ready to do the work, including changes/improvements after initial merge, it seems quite low-risk to merge the job - and if it turns out to be a burden, then you can simply decide to disable the job again. |
That's a fair point. Is there an official Docker image for pyodide? I personally would be comfortable debugging if this was available, not sure about others, though. |
Since other pydata libraries have been testing it, I would be OK adding it to the CI but, if it comes to it, allow it to fail without blocking a merge |
@lithomas1 there is a Pyodide Docker image, yes: https://pyodide.org/en/stable/development/building-from-sources.html#using-docker. I don't use it personally and I prefer debugging on GitHub Actions because they don't have an aarch64 image yet (pyodide/pyodide#1386), and emulation is slow up to multiple magnitudes for me. |
Feature Type
Problem Description
This feature proposes adding out-of-tree Pyodide builds for
pandas
on its own CI. As opposed to in-tree builds in the Pyodide packages repository (see the recipe), this will enable building WASM wheels (which are 32-bit)1 via the Emscripten toolchain on GitHub Actions.In my most recent work assignment, I am working on improving the interoperability for the Scientific Python ecosystem of packages with Pyodide and with each other (Quansight-Labs/czi-scientific-python-mgmt#18), which shall culminate with efforts towards bringing interactive documentation for these packages where they can then be run in JupyterLite notebooks, through nightly builds and wheels for these packages pushed to PyPI-like indices on Anaconda.org (Quansight-Labs/czi-scientific-python-mgmt#19) at and during a later phase during the project.
Feature Description
pandas
in an activated virtual environment created and handled by Pyodide.Alternative Solutions
N/A
Additional Context
I notice that the build and the tests are going to be split into separate packages in #53007, which can be handled (I assume that the imminent
pandas-tests
package will be a pure Python one, which does not need compilation for Pyodide – they are supported by default).cc @lithomas1 for any additional comments here, as required. Thomas and I have been in conversation about this just about a fortnight ago, and we had agreed that reducing the size of the wheels would be helpful for the WebAssembly rendition of
pandas
as well – it is particularly useful where reducing bandwidth usage is in the question. I am happy to follow along on further developments on that regard.Footnotes
It is to be noted that DEPS: drop 32-bit support #15889 removed 32-bit support by removing the wheels, but if I understand correctly,
pandas
continues to be supported when building from source on 32-bit platforms. ↩The text was updated successfully, but these errors were encountered: