# This passes the mock test, but fails when used in a Trino/Iceberg environment
import math
from datetime import datetime
from unittest import mock

# TrinoBatchInsert is the batch-insert helper under test; import it from its own package


def test_trino_pandas_insert():
    import pandas as pd

    # mock up an sqlalchemy table
    tbl = mock.MagicMock()
    tbl.name = "test_pandas"
    # mock up an sqlalchemy Connection
    cxn = mock.MagicMock()
    df = pd.DataFrame(
        {"A": [4.5], "B'C": [math.nan], None: [math.inf], "D": [-math.inf], "E": [datetime(2022, 1, 1)], ":F": [1.0]}
    ).convert_dtypes()
    assert (df.dtypes == ['Float64', 'Int64', 'Float64', 'Float64', 'datetime64[ns]', 'Int64']).all()
    df.to_sql(
        tbl.name,
        con=cxn,
        schema="test",
        if_exists="append",
        index=False,
        method=TrinoBatchInsert(batch_size=5, verbose=True),
    )
The failure is that Pandas translates datetime64[ns] to timestamp(3), but Iceberg wants timestamp(6). I have tried to find a dtype argument for to_sql that fixes this, but everything I have passed so far still resolves to timestamp(3) inside Pandas. I need to pass it something that resolves to timestamp(6).
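One possible workaround, sketched here under assumptions rather than confirmed against Trino/Iceberg, is to create the table up front with explicit column types, so that the append does not depend on whatever type Pandas infers for datetime64[ns]. The names PANDAS_TO_TRINO, trino_column_defs, and create_table_sql are hypothetical helpers made up for this illustration, and the sketch only helps if the mismatch arises when the table is created.

```python
# Hypothetical mapping from pandas dtype names to Trino column types.
# datetime64[ns] is mapped to timestamp(6), the precision Iceberg expects.
PANDAS_TO_TRINO = {
    "Float64": "double",
    "float64": "double",
    "Int64": "bigint",
    "int64": "bigint",
    "boolean": "boolean",
    "bool": "boolean",
    "string": "varchar",
    "object": "varchar",
    "datetime64[ns]": "timestamp(6)",
}


def trino_column_defs(dtypes):
    """Build a comma-separated column list from a {column name: dtype} mapping."""
    defs = []
    for name, dtype in dtypes.items():
        # Unknown dtypes fall back to varchar (an assumption of this sketch).
        trino_type = PANDAS_TO_TRINO.get(str(dtype), "varchar")
        # Quote the identifier, doubling any embedded double quotes.
        quoted = '"' + str(name).replace('"', '""') + '"'
        defs.append(f"{quoted} {trino_type}")
    return ", ".join(defs)


def create_table_sql(schema, table, dtypes):
    """Return a CREATE TABLE statement with explicit column types."""
    return f"create table if not exists {schema}.{table} ({trino_column_defs(dtypes)})"


example = create_table_sql("test", "test_pandas", {"A": "Float64", "E": "datetime64[ns]"})
print(example)
```

With a real DataFrame, the mapping can come from df.dtypes.to_dict(); the generated statement would be executed on the connection before calling df.to_sql(..., if_exists="append").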