Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Make test_sql.py parallelizable #60378

Open
1 of 3 tasks
WillAyd opened this issue Nov 20, 2024 · 8 comments · May be fixed by #60595
Open
1 of 3 tasks

TST: Make test_sql.py parallelizable #60378

WillAyd opened this issue Nov 20, 2024 · 8 comments · May be fixed by #60595
Assignees
Labels
good first issue IO SQL to_sql, read_sql, read_sql_query Testing pandas testing functions or related to the test suite

Comments

@WillAyd
Copy link
Member

WillAyd commented Nov 20, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

test_sql.py must be run on a single thread now, because tests re-use the same table names. This can cause a race condition when different parametrizations of a test run on different threads

Feature Description

Add a uuid or something else to the table names in the test_sql.py module to disambiguate

Alternative Solutions

status quo

Additional Context

No response

@WillAyd WillAyd added Testing pandas testing functions or related to the test suite IO SQL to_sql, read_sql, read_sql_query good first issue labels Nov 20, 2024
@UmbertoFasci
Copy link
Contributor

take

@WillAyd WillAyd changed the title TST: Make test_sql.py serializable TST: Make test_sql.py parallelizable Nov 20, 2024
@UmbertoFasci
Copy link
Contributor

@WillAyd I am about halfway through the tests. I am generating a unique table uuid when indicated while maintaining the original context through the prefix.

Before:

@pytest.mark.parametrize("conn", all_connectable)
def test_read_table_columns(conn, request, test_frame1):
    # test columns argument in read_table
    conn_name = conn
    if conn_name == "sqlite_buildin":
        request.applymarker(pytest.mark.xfail(reason="Not Implemented"))

    conn = request.getfixturevalue(conn)
    sql.to_sql(test_frame1, "test_frame", conn)

    cols = ["A", "B"]

    result = sql.read_sql_table("test_frame", conn, columns=cols)
    assert result.columns.tolist() == cols

After made parallelizable:

@pytest.mark.parametrize("conn", all_connectable)
def test_read_table_columns(conn, request, test_frame1):
    # test columns argument in read_table
    conn_name = conn
    if conn_name == "sqlite_buildin":
        request.applymarker(pytest.mark.xfail(reason="Not Implemented"))

    conn = request.getfixturevalue(conn)
    table_uuid = f"test_frame_{uuid.uuid4().hex}"
    sql.to_sql(test_frame1, table_uuid, conn)

    cols = ["A", "B"]

    result = sql.read_sql_table(table_uuid, conn, columns=cols)
    assert result.columns.tolist() == cols

Let me know if you would like this done in a different fashion.

@WillAyd
Copy link
Member Author

WillAyd commented Nov 21, 2024

Seems reasonable. Probably worth a helper function in the module to not have to repeat the same code in each function, but what you have looks like its headed in the right direction

@arashgodgiven
Copy link

Hi, noticed this issue is being worked on. Is there any way I can assist or take on parts of the work to contribute? I'm interested in helping out.

@UmbertoFasci
Copy link
Contributor

Hi @arashgodgiven I am fairly close to wrapping this up. I am just handling the iris and types connectables now. Keep an eye out though, thanks for asking!

@narutonamikaze
Copy link

hey ! i think the issue still hasn't been resolved, and would like to work upon it immediately !

@UmbertoFasci
Copy link
Contributor

All that is remaining is handling the test cases that share a database state through a couple of fixtures which is only a couple of connectable sets. Submitting PR this coming Friday.

@UmbertoFasci UmbertoFasci linked a pull request Dec 20, 2024 that will close this issue
2 tasks
@UmbertoFasci
Copy link
Contributor

@WillAyd I have gone ahead and submitted a PR for this issue, however I believe there is an issue with the current reviewer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue IO SQL to_sql, read_sql, read_sql_query Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants