Uploading your first Dataset
In this example we will examine how to upload your first dataset to the platform.
Step 1: Connecting to the platform
As a first step, you will need to connect to the platform using your credentials.
from typing import Final
import evoml_client as ec
import pandas as pd
# Please replace with your deploy-platform URL
API_URL: Final[str] = "https://evoml.ai"
# Please replace with your username
USERNAME: Final[str] = ""
# Please replace with your password
PASSWORD: Final[str] = ""
# Connect to the platform
ec.init(base_url=API_URL, username=USERNAME, password=PASSWORD)
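Hard-coding a password in a script makes it easy to leak by accident. As a minimal sketch of one alternative, the helper below reads the same three settings from environment variables. The variable names (EVOML_API_URL, EVOML_USERNAME, EVOML_PASSWORD) and the load_credentials function are hypothetical, chosen for illustration only; they are not part of the evoml client library.

```python
import os
from typing import Dict, Final, Mapping, Optional

# Hypothetical environment variable names (an assumption, not defined by evoml_client)
ENV_URL: Final[str] = "EVOML_API_URL"
ENV_USER: Final[str] = "EVOML_USERNAME"
ENV_PASS: Final[str] = "EVOML_PASSWORD"


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect connection settings from the environment (or a supplied mapping)."""
    source = os.environ if env is None else env
    # Report every missing or empty setting at once rather than failing on the first
    missing = [name for name in (ENV_URL, ENV_USER, ENV_PASS) if not source.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Keys mirror the keyword arguments passed to ec.init in the snippet above
    return {
        "base_url": source[ENV_URL],
        "username": source[ENV_USER],
        "password": source[ENV_PASS],
    }
```

Because the returned keys match the keyword arguments used in the connection call above, the result can be unpacked directly, as in ec.init(**load_credentials()).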
Step 2: Create and Load your dataset
The EvoML client library offers several ways to create a new Dataset. For example:
- By reading from a file using the ec.Dataset.from_file() method
- By loading a numpy array using the ec.Dataset.from_numpy() method
- By loading a pandas dataframe using the ec.Dataset.from_pandas() method
- By ingesting a file from a URL using the ec.Dataset.from_url() method
*Supported file formats include: csv, parquet, avro, feather and hdf5.
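Before reading from a file, it can help to check locally that the file is in one of the formats listed above. The sketch below is one way to do that by extension. The is_supported_file helper is hypothetical, and the exact extension spellings (for example .h5 as well as .hdf5) are assumptions; the platform may accept or reject files on other grounds.

```python
from pathlib import Path

# Formats listed above; the extension spellings are assumptions, not taken from the library
SUPPORTED_EXTENSIONS = {".csv", ".parquet", ".avro", ".feather", ".h5", ".hdf5"}


def is_supported_file(path: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    # Compare case-insensitively so "TRAIN.CSV" is treated like "train.csv"
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

This only inspects the file name, not its contents, so it is a quick guard against obvious mistakes rather than a validation of the data itself.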
Now let's create a small pandas dataframe!
import pandas as pd
import numpy as np
# Create random data for 100 rows and 10 columns
data = np.random.rand(100, 10)
# Define column names
columns = [f"Column_{i+1}" for i in range(10)]
# Create the DataFrame
df = pd.DataFrame(data, columns=columns)
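The snippet above produces different values on every run. If you want the example dataset to be reproducible, one option is to use a seeded NumPy generator. This is a sketch under that assumption; the make_random_frame helper and the seed value are illustrative and not part of the tutorial's original code.

```python
import numpy as np
import pandas as pd


def make_random_frame(rows: int = 100, cols: int = 10, seed: int = 42) -> pd.DataFrame:
    """Build a DataFrame of uniform random values that is identical for a given seed."""
    # A seeded generator yields the same sequence each time it is created
    rng = np.random.default_rng(seed)
    data = rng.random((rows, cols))
    # Same naming scheme as above: Column_1 ... Column_N
    columns = [f"Column_{i + 1}" for i in range(cols)]
    return pd.DataFrame(data, columns=columns)


df = make_random_frame()
```

The resulting df has the same shape and column names as the one created above, so it can be used in its place in the following step.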
We can use ec.Dataset.from_pandas to upload the dataframe to the platform.
dataset = ec.Dataset.from_pandas(df, name="Random Data")
dataset.put()
print(f"Dataset ID: {dataset.dataset_id}")
The dataset has now been uploaded to the platform. You should be able to view it by navigating to the EvoML Platform in your browser.