Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add babel-plugin-transform-regex to devDependencies #773

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

slevithan
Copy link
Contributor

@slevithan slevithan commented Aug 9, 2024

Closes #704.

Per previous discussion with @fabian-hiller in #704, this adds package babel-plugin-transform-regex to devDependencies for improved DX.

  • Tests, lint, prettier, and build all pass.
  • Renamed library/src/regex.ts as library/src/regex.src.ts.
    • regex.ts is now the transpiled output.
  • In regex.src.ts, I updated only the first two regexes (BASE64_REGEX and BIC_REGEX) to use the regex tag with more readable syntax (using free spacing, subroutines, a subroutine definition group, and noncapturing (…)).
    • The emitted source in regex.ts for these two regexes is identical to before, which helps demonstrate that the Babel plugin is doing a great job with producing efficient and readable native regexes (since the regexes being replaced were hand-tuned regex literals). As a result, this PR's only change to regex.ts is an added header comment.
    • Other regexes and the rest of the code in regex.src.ts (apart from a hint comment at the top) is untouched.
    • I'll update the remaining regexes (many of which will benefit much more than these two, which are quite simple) after this PR lands.
  • Also added package regex to devDependencies so the types work in regex.src.ts.
    • The import { regex } from 'regex'; line in regex.src.ts is stripped from transpiled source by the Babel plugin (via its removeImport option which removes any import declarations with module name 'regex').
    • To make the types for regex work, I had to update library/tsconfig.json, changing "moduleResolution": "node" to "Bundler". Values "NodeNext" and "Node16" would also work for regex.src.ts, but those values conflict with other, current tsconfig.json settings.
      • This change doesn't seem to cause any issues elsewhere, but I'm not deeply knowledgeable about TS/CJS/bundling so it's worth examining.

@fabian-hiller
Copy link
Owner

Thank you for creating this PR. Since I have less time at the moment and other things have a higher priority, I will try to get back to you next week.

@fabian-hiller fabian-hiller added the enhancement New feature or request label Aug 10, 2024
@fabian-hiller fabian-hiller self-assigned this Aug 10, 2024
@fabian-hiller
Copy link
Owner

Is it not possible to do the transformation as part of our build step? I don't like that we need a regex.ts and regex.src.ts.

@slevithan
Copy link
Contributor Author

slevithan commented Aug 17, 2024

No problem. I've updated the PR with a solution that generates a transpiled/regex.ts file to use for the build, and then deletes it afterward in a postbuild script (which I added the rimraf package to do).

tsup (and esbuild, which it uses) doesn't have a built-in way to alias local file import paths, so I created a tiny build plugin in tsup.config.ts to do this.


If you prefer to avoid the build plugin, an alternative to the current (new) approach could be:

  1. Copy (or rename) regex.ts to a temp file (lets say regex.orig.ts).
  2. Overwrite regex.ts using Babel.
  3. Run the build with tsup.
  4. Replace regex.ts with regex.orig.ts.

But my initial thought is that moving files around like that would not be preferable.

@fabian-hiller
Copy link
Owner

Thanks for the update! I am pretty busy at the moment and mainly focused on getting our v1 RC and after that our v1 release ready. I will come back to this PR when I have more time again or this feature becomes more important.

@slevithan slevithan force-pushed the regex_babel branch 2 times, most recently from 442aa61 to 96bb4be Compare August 18, 2024 18:11
@slevithan
Copy link
Contributor Author

No problem, take your time. If you have a moment, though, I'm curious whether you have concerns with the current approach (or need time to think about it), or if you think the current PR is on the right track to pick back up from when you have more time.

library/package.json Outdated Show resolved Hide resolved
@fabian-hiller
Copy link
Owner

In general, I don't like to complicate the setup because I like things simple, but of course I see the benefits of this PR.

Is it possible to run our tests with the compiled output? For security reasons this seems to be necessary in the long run.

@slevithan
Copy link
Contributor Author

In general, I don't like to complicate the setup because I like things simple, but of course I see the benefits of this PR.

I share your concern. This was actually the primary reason for my initial approach of having a separate "source" file, with regex.ts being the transpiled output. But I see the downside of that as well.

I've simplified things somewhat in the latest update to the PR, including getting rid of rimraf, and narrowing the aliasing plugin (which can always be generalized later if for whatever reason more paths need to be aliased).

Is it possible to run our tests with the compiled output? For security reasons this seems to be necessary in the long run.

I've updated the PR to do so. Vite/Rollup share the resolve.alias option, so this was easy to do. Their local aliasing support seems very robust (and allows regexes or functions, which I didn't use here).

@fabian-hiller
Copy link
Owner

Thank you for the changes! This PR is now going in a good direction.

@slevithan
Copy link
Contributor Author

slevithan commented Aug 19, 2024

Glad to hear it! No worries if it takes you some time to come back to this, but in the meantime, I've made a couple more tweaks and I've gone ahead and updated the PR to cover the first 12 of 24 regexes in regex.ts. Hopefully this shows more substantively what the end result will be like (I think it makes sense to still wait on rewriting the remaining regexes until after this PR lands). Of the first 12 regexes, a couple (CUID2_REGEX and DECIMAL_REGEX) were so simple that they didn't benefit from regex, so I left them alone. And EMOJI_REGEX was generated by an external library, so I also left it alone so it can be updated more easily.

I also included the following question/comment above IP_REGEX:

// Q: Would it be tree-shakeable if this used interpolation without adding any new variables, as
// ``new RegExp(`^(?:${IPV4_REGEX.source.slice(1, -1)}|${IPV6_REGEX.source.slice(1, -1)})$`, 'iu')``?

If the answer is yes, that would remove a lot of redundancy, since this is the biggest regex in the file. That redundancy is much easier to verify now that I've rewritten the regexes in a readable way.

Important: I've been very careful to ensure that none of the refactored regexes changed what they match (existing bugs in IPV6_REGEX and IP_REGEX have been preserved). I've also avoided adding any capturing groups, since that would change the shape of match results.

@slevithan
Copy link
Contributor Author

I've rebased on top of recent conflicting Valibot configuration changes, and additionally bumped regex to v4.1.3. The latest version of regex avoids the need to update Valibot's library/tsconfig.json to use "moduleResolution": "Bundler". So I've undone that change.

@fabian-hiller fabian-hiller force-pushed the main branch 4 times, most recently from 4666cf9 to 41ae798 Compare September 1, 2024 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve DX with babel-plugin-transform-regex
2 participants