-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate sqlite output from RakuDoc source input #359
Comments
The first step - I suggest - would be to create an |
Thinking big, this could also be a good starting point for a |
@patrickbkr Lets see how the first step goes, but if routines works well, then we can put all the data from the Search function into an sql next. And then locally maybe have a Cro app running so that sql return data in the form of http links to individual pages are served locally. |
This full text search module is available: https://sqlite.org/fts5.html I've never used it, myself. |
- add sqlite-db plugin - plugin accesses the dataset created for the Routines page in the website - the data is formatted into the rows of an sqlite table - the completed sqlite database is output into a filename specified in the `configs/03-plugin-options.raku` config file, so changing the value of the `db-filename` field of the `sqlite-db` sub-hash changes the output file name. - the sqlite file is moved by Collection to a directory defined by 'database-dir' relative to the directory in which Collection runs. As set up here, the directory sqlite_dir will be at the same level as the existing renedered_html
* First step to issue #359, no change to the HTML output - add sqlite-db plugin - plugin accesses the dataset created for the Routines page in the website - the data is formatted into the rows of an sqlite table - the completed sqlite database is output into a filename specified in the `configs/03-plugin-options.raku` config file, so changing the value of the `db-filename` field of the `sqlite-db` sub-hash changes the output file name. - the sqlite file is moved by Collection to a directory defined by 'database-dir' relative to the directory in which Collection runs. As set up here, the directory sqlite_dir will be at the same level as the existing renedered_html * Separate out schema from data - data is now in filename specified in plugin config
If - during the build - we can identify hyperlinks (whether relative or external), it would be useful to toss them into a That's just one use case. There are many. I'd like to start re-architecting the build to pivot around a normalized sqlite database. Here is a diagram. If we pull this off, the advantage will be that we can decouple each of the downstream processes that depend on the parsed data (A, B, C, D in the diagram). SQLite supports multiple processes reading simultaneously, so we can run them in parallel to speed things up significantly. I also think it will be easier for contributors (but I could be wrong, who knows). The challenge is that the "DB Creation script", the build process entry point, becomes more complex. Effectively this component is a compiler from rakudoc to SQL statements.
In principle, if we capture and normalize all the rakudoc source material, we can drive any downstream use case: static websites, man pages, tests, offline search. |
@dontlaugh This is a very significant change. Honestly, I haven't thought of the build process in this way, and it will take me a while to think through how to do this. I think your |
It sounds like it. I imagine that the complex program on the left hand side would be best implemented by a library that works with a proper AST. The data we need in normalized form will look different than an in-memory AST, but as long as each AST node can be serialized as text or bytes, we can store both in different database tables.
I am pleased to hear that. What library is parsing the rakudoc into an AST? Can you link to it? Even if it is in nascent stages.
I agree. This is in the high-effort, but (potentially) high-reward category. |
@dontlaugh the new Rakudo compiler creates AST for all programs and the AST can be manipulated. Although the Rakudo AST compiler has not yet completely landed - this will be the raku.e milestone - there is sufficient for RakuDoc, and that has been backported into raku.d. As an example, if you take a recent version of raku, eg 2024.04, and run the following in a terminal assuming the current directory is a local clone of
The new bit is the |
Saw this in Rakudo Weekly: The |
This issue is spawned from #75
The objective here is to create a schema and populate it with the data attributes parsed from https://github.com/Raku/doc
The resulting sqlite database could be used to power relational queries, such as all classes that implement a role, or listing all methods on a class and whether they come from roles, etc.
A sqlite database can also be a potential intermediary format for building the website itself. E.g. the middle part of the Collection framework's pipeline could be simplified or extended if we can consistently parse our pod6 files into a relational database.
Somewhat related issues:
The text was updated successfully, but these errors were encountered: