Support for async iterators in seq functions #1181

Open
ikappaki opened this issue Dec 15, 2024 · 5 comments
Comments

@ikappaki
Contributor

Hi,

Python provides async for to iterate over async iterators.

The documentation at the yield special form suggests that "...Basilisp seq and sequence functions integrate seamlessly with Python generators", but it seems that doesn't cover async generators.

For example, given the following Python file, which defines an async generator function agen and demonstrates its use:
afor.py

import asyncio

async def agen():
    for i in range(5):
        await asyncio.sleep(0.01)
        yield i

async def main():
    async for value in agen():
        print(value)

if __name__ == "__main__":
    asyncio.run(main())

When run, it successfully iterates over agen:

> python afor.py
0
1
2
3
4

However, if I try to iterate over the same async generator agen from Basilisp, using seq (or vec), it fails with an "'async_generator' object is not iterable" error:
issuefor.lpy

(import asyncio afor)

(defasync main
  []
  (seq (afor/agen)))

(asyncio/run (main))

> basilisp run issuefor.lpy
Traceback (most recent call last):
...
  File "C:\src\basilisp\src\basilisp\lang\seq.py", line 254, in to_seq
    return _seq_or_nil(sequence(o))
                       ^^^^^^^^^^^
  File "C:\src\basilisp\src\basilisp\lang\seq.py", line 223, in sequence
    i = iter(s)
        ^^^^^^^
TypeError: 'async_generator' object is not iterable

I'm not sure what the right solution is here. Would it be possible to extend the "seq and sequence functions" as described in the documentation to support async iterators?
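
For reference, the same failure can be reproduced in plain Python, which suggests the error comes from the iterator protocol rather than from anything specific to Basilisp. The sketch below is only an illustration: `try_sync_iter` and `collect` are hypothetical helper names, and collecting into a list inside an async context is one possible workaround, not something the project prescribes.

```python
import asyncio


async def agen():
    for i in range(5):
        await asyncio.sleep(0)
        yield i


def try_sync_iter():
    # iter() looks for __iter__, which async generator objects do not define,
    # so this raises the same TypeError shown in the traceback above.
    try:
        iter(agen())
    except TypeError as exc:
        return str(exc)
    return None


async def collect():
    # An async comprehension drains the async generator inside an async
    # context and produces an ordinary list, which any sync code can iterate.
    return [value async for value in agen()]


error_message = try_sync_iter()
collected = asyncio.run(collect())
```

The resulting list is fully realized, so this gives up laziness in exchange for compatibility with synchronous sequence functions.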

Thanks

@chrisrink10
Member

The problem with Python's async implementation (and why I haven't particularly spent much effort trying to incorporate it into Basilisp) is that functions are "colored" and the specific function color is contagious. Once a function is asynchronous, it can only be used within async contexts (which are the only contexts that can await the results). It is likely that the entirety of Basilisp's functional library would require a parallel implementation in order to work in the way you're requesting.

There are many issues that I see arising, but just to give an example: nearly every Basilisp lazy seq function (e.g. map, filter, etc.) returns a LazySeq (as by lazy-seq), which is computed by continuously invoking a new function that (typically) returns another LazySeq containing an element and yet another function. Just switching the underlying mechanism to an async def isn't sufficient because then each call to that function must know to await the results rather than simply calling the function. This isn't to mention that the async iterator implementation also contains a completely parallel interface (e.g. __aiter__ and __anext__), which greatly complicates how various components of the ecosystem interact with these new async seqs.
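
The "must know to await" point can be illustrated with a small Python sketch. The names `realize`, `sync_thunk`, and `async_thunk` are hypothetical stand-ins and this is not Basilisp's actual LazySeq code; it only shows that a caller which simply invokes its body function gets a coroutine object, not a value, once that function becomes async.

```python
import asyncio
import inspect


def sync_thunk():
    return 1


async def async_thunk():
    return 1


def realize(thunk):
    # Stand-in for a lazy seq forcing its body: a plain synchronous call.
    return thunk()


sync_result = realize(sync_thunk)    # an ordinary value
pending = realize(async_thunk)       # a coroutine object, nothing has run yet
is_coroutine = inspect.iscoroutine(pending)

# Only an async context (or an event loop runner) can obtain the value.
async_result = asyncio.run(pending)
```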

I personally am ok with adding async primitives to Basilisp so users can access them (and fixing issues such as #1179 and #1180), but I do not want to infect the core seq library with Python async (absent a detailed analysis of the effects across the entire project). To me it seems like what you are asking for should instead be a separate contrib style library in Clojure with variants of alazy-seq, amap, afilter, etc.
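
As a rough idea of what such variants might look like on the Python side, here is a minimal sketch built on async generators. The names `amap` and `afilter` are taken from the comment above, but the signatures and behavior are assumptions made for illustration; no such library is confirmed to exist.

```python
import asyncio


async def amap(f, source):
    # Hypothetical async counterpart to map: applies f to each awaited item.
    async for item in source:
        yield f(item)


async def afilter(pred, source):
    # Hypothetical async counterpart to filter.
    async for item in source:
        if pred(item):
            yield item


async def agen():
    for i in range(5):
        await asyncio.sleep(0)
        yield i


async def main():
    evens_times_ten = amap(lambda x: x * 10, afilter(lambda x: x % 2 == 0, agen()))
    return [value async for value in evens_times_ten]


result = asyncio.run(main())
```

Note that the pipeline can only be consumed from inside an async function, which is the contagion described above.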

@chrisrink10
Member

> The documentation at the yield special form suggests that "...Basilisp seq and sequence functions integrate seamlessly with Python generators", but it seems that doesn't cover async generators.

Async generators are not generators, so this is not logically inconsistent.
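
The distinction can be checked from Python's standard library. This short sketch uses only `inspect` and `collections.abc`; the function names `gen` and `agen` are placeholders.

```python
import inspect
from collections.abc import AsyncIterator, Iterator


def gen():
    yield 1


async def agen():
    yield 1


g, ag = gen(), agen()

checks = {
    "gen_is_generator": inspect.isgenerator(g),
    "agen_is_generator": inspect.isgenerator(ag),
    "agen_is_asyncgen": inspect.isasyncgen(ag),
    # Iterator requires __iter__/__next__; AsyncIterator requires
    # __aiter__/__anext__. An async generator only has the latter pair.
    "gen_is_iterator": isinstance(g, Iterator),
    "agen_is_iterator": isinstance(ag, Iterator),
    "agen_is_async_iterator": isinstance(ag, AsyncIterator),
}
```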

@ikappaki
Contributor Author

> The problem with Python's async implementation (and why I haven't particularly spent much effort trying to incorporate it into Basilisp) is that functions are "colored" and the specific function color is contagious. Once a function is asynchronous, it can only be used within async contexts (which are the only contexts that can await the results). It is likely that the entirety of Basilisp's functional library would require a parallel implementation in order to work in the way you're requesting.
>
> There are many issues that I see arising, but just to give an example: nearly every Basilisp lazy seq function (e.g. map, filter, etc.) returns a LazySeq (as by lazy-seq), which is computed by continuously invoking a new function that (typically) returns another LazySeq containing an element and yet another function. Just switching the underlying mechanism to an async def isn't sufficient because then each call to that function must know to await the results rather than simply calling the function. This isn't to mention that the async iterator implementation also contains a completely parallel interface (e.g. __aiter__ and __anext__), which greatly complicates how various components of the ecosystem interact with these new async seqs.
>
> I personally am ok with adding async primitives to Basilisp so users can access them (and fixing issues such as #1179 and #1180), but I do not want to infect the core seq library with Python async (absent a detailed analysis of the effects across the entire project). To me it seems like what you are asking for should instead be a separate contrib style library in Clojure with variants of alazy-seq, amap, afilter, etc.

Right, thanks for taking the time to explain the situation and highlighting the challenges with async constructs in python. My knowledge of this subject is fairly superficial, but I wanted to bring attention to it as a potential gap. This particular issue caught my attention after reviewing async with.

Creating an external library to address this seems like an excellent idea, provided, as you mentioned, that the necessary primitives are in place. Do you have any initial thoughts on how an async for concept could be supported as a primitive construct? The first approach that comes to mind is something like ^:async for, but I'm not sure what challenges that might involve.

@chrisrink10
Member

> Creating an external library to address this seems like an excellent idea, provided, as you mentioned, that the necessary primitives are in place. Do you have any initial thoughts on how an async for concept could be supported as a primitive construct? The first approach that comes to mind is something like ^:async for, but I'm not sure what challenges that might involve.

I may be misunderstanding you but I don't think Clojure/Basilisp for is what I would consider a primitive (from the perspective of a Python async). I would not support creating or supporting asynchronous operations directly using the Basilisp for construct. By primitives, I mean the building blocks of asynchronous functions (being able to define async functions and await their results). The library would be responsible for creating its own interfaces and operating on them.

If you're just talking about how to adapt a Python async for into Basilisp, you'd need to probably create a parallel interface to ISeq (like IAsyncSeq) which has async variants of all of the ISeq methods and dunders. Then presumably you'd just coerce any asynchronous generator or sequence into a sequence using something like an aseq function which could then be passed to any other asynchronous seq function.
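
To make the idea concrete, here is a rough Python sketch of what an `aseq` coercion might look like. Everything in it is hypothetical: the `AsyncSeq` class, its `first` property, its awaitable `rest` method, and the `aseq` function are assumptions for illustration and do not correspond to any existing Basilisp interface. It also ignores concurrency (two tasks awaiting `rest` at the same time are not handled).

```python
import asyncio

_UNSET = object()


class AsyncSeq:
    """Hypothetical async seq cell: a realized head and a lazily awaited tail."""

    def __init__(self, first, source):
        self._first = first
        self._source = source  # the underlying async iterator
        self._rest = _UNSET    # cached so each element is pulled only once

    @property
    def first(self):
        return self._first

    async def rest(self):
        if self._rest is _UNSET:
            self._rest = await aseq(self._source)
        return self._rest

    def __aiter__(self):
        return _walk(self)


async def _walk(cell):
    # Follow the chain of cells until the source is exhausted.
    while cell is not None:
        yield cell.first
        cell = await cell.rest()


async def aseq(source):
    """Coerce an async iterable into an AsyncSeq, or None if it is empty."""
    iterator = source.__aiter__()
    try:
        head = await iterator.__anext__()
    except StopAsyncIteration:
        return None
    return AsyncSeq(head, iterator)


async def agen(n):
    for i in range(n):
        await asyncio.sleep(0)
        yield i


async def main():
    s = await aseq(agen(3))
    second = (await s.rest()).first
    second_again = (await s.rest()).first  # served from the cache
    everything = [value async for value in s]
    empty = await aseq(agen(0))
    return s.first, second, second_again, everything, empty


first, second, second_again, everything, empty = asyncio.run(main())
```

Even in this small form, every accessor past `first` has to be awaited, which shows why a parallel interface, not a reuse of ISeq, would be needed.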

@ikappaki
Contributor Author

> > Creating an external library to address this seems like an excellent idea, provided, as you mentioned, that the necessary primitives are in place. Do you have any initial thoughts on how an async for concept could be supported as a primitive construct? The first approach that comes to mind is something like ^:async for, but I'm not sure what challenges that might involve.
>
> I may be misunderstanding you but I don't think Clojure/Basilisp for is what I would consider a primitive (from the perspective of a Python async). I would not support creating or supporting asynchronous operations directly using the Basilisp for construct. By primitives, I mean the building blocks of asynchronous functions (being able to define async functions and await their results). The library would be responsible for creating its own interfaces and operating on them.
>
> If you're just talking about how to adapt a Python async for into Basilisp, you'd need to probably create a parallel interface to ISeq (like IAsyncSeq) which has async variants of all of the ISeq methods and dunders. Then presumably you'd just coerce any asynchronous generator or sequence into a sequence using something like an aseq function which could then be passed to any other asynchronous seq function.

My reasoning was as follows: Python has a primitive async for construct (backed by ast.AsyncFor), so I thought it reasonable to suggest a Basilisp "primitive" counterpart. A direct translation to the Python AST would likely be the most performant. Since this involves Python's for construct, the idea of supporting an ^:async metadata tag on the Basilisp for initially seemed like a nice syntactic suggestion.

I hope this clarifies my train of thought behind the suggestion.

To summarize the options discussed:

  1. Create a parallel interface to ISeq (like IAsyncSeq) which has async variants of all the ISeq methods and dunders.
  2. Support a Basilisp construct that directly translates to an AsyncFor AST node. Example options include a new primitive like python/asyncfor, an ^:async for (which you have rightly discounted), or any other better syntactic construct that generates an AsyncFor AST node.

Does option 2 seem feasible and something that you might consider? Option 1 feels like overkill to me.

Here’s an example of how I imagine python/asyncfor could look, building on the original example in this ticket:

(import asyncio afor)

(defasync main
  []
  (python/asyncfor [value (afor/agen)]
                   (println value)))

(asyncio/run (main))
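
For context, this is the AST shape that the Python version of the loop parses to. The sketch only inspects Python's own `ast` module output; how a Basilisp form would be compiled to such a node is not shown and is left open.

```python
import ast

source = """
async def main():
    async for value in agen():
        print(value)
"""

tree = ast.parse(source)
func = tree.body[0]   # the enclosing async function definition
loop = func.body[0]   # the async for statement inside it

node_types = (type(func).__name__, type(loop).__name__)
target_name = loop.target.id          # the loop variable
iter_source = ast.unparse(loop.iter)  # the expression being iterated
```

The AsyncFor node carries the same `target`, `iter`, and `body` fields as a regular For node, so a hypothetical python/asyncfor form would mainly need to supply a binding and a body.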

Thanks
