Skip to main content

4 posts tagged with "spark"

View All Tags

Spice v1.0-rc.5 (Jan 13, 2025)

· 7 min read
Evgenii Khramkov
Senior Software Engineer at Spice AI

Announcing the release of Spice v1.0-rc.5 🛠️

Spice v1.0.0-rc.5 is the fifth release candidate for the first major version of Spice.ai OSS. This release focuses production readiness and critical bug fixes. In addition, a new DynamoDB data connector has been added along with automatic detection for GPU acceleration when running Spice using the CLI.

Highlights in v1.0-rc.5

  • Automatic GPU Acceleration Detection: Automatically detect and utilize GPU acceleration when running by CLI. Install AI components locally using the CLI command spice install ai. Currently supports NVIdia CUDA and Apple Metal (M-series).

  • DynamoDB Data Connector: Query AWS DynamoDB tables using SQL with the new DynamoDB Data Connector.

datasets:
- from: dynamodb:users
name: users
params:
dynamodb_aws_region: us-west-2
dynamodb_aws_access_key_id: ${secrets:aws_access_key_id}
dynamodb_aws_secret_access_key: ${secrets:aws_secret_access_key}
acceleration:
enabled: true
sql> describe users;
+----------------+-----------+-------------+
| column_name | data_type | is_nullable |
+----------------+-----------+-------------+
| created_at | Utf8 | YES |
| date_of_birth | Utf8 | YES |
| email | Utf8 | YES |
| account_status | Utf8 | YES |
| updated_at | Utf8 | YES |
| full_name | Utf8 | YES |
| ... |
+----------------+-----------+-------------+
  • File Data Connector: Graduated to Stable.

  • Dremio Data Connector: Graduated to Release Candidate (RC).

  • Spice.ai, Spark, and Snowflake Data Connectors: Graduated to Beta.

Spice v0.12-alpha (April 29, 2024)

· 4 min read
Luke Kim
Founder and CEO of Spice AI

The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.

Highlights

  • Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.

  • Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.

  • Refresh data window: Limit accelerated dataset data refreshes to the specified window, as a duration from now configuration setting, for faster and more efficient refreshes.

  • ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.

  • Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.

Spice v0.11.1-alpha (April 22, 2024)

· 3 min read
Luke Kim
Founder and CEO of Spice AI

The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.

Highlights

  • Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.

  • Windows Installation Support: Native Windows installation support, including upgrades.

  • Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.

Spice v0.11-alpha (April 15, 2024)

· 4 min read
Sergei Grebnov
Senior Software Engineer at Spice AI

The Spice v0.11-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.

Highlights in v0.11-alpha

DuckDB data connector: Use DuckDB databases or connections as a data source.

AWS Secrets Manager Secret Store: Use AWS Secrets Managers as a secret store.

Custom Refresh SQL: Specify a custom SQL query for dataset refresh using refresh_sql.

Dataset Refresh API: Trigger a dataset refresh using the new CLI command spice refresh or via API.

Expanded SSL support for Postgres: SSL mode now supports disable, require, prefer, verify-ca, verify-full options with the default mode changed to require. Added pg_sslrootcert parameter for setting a custom root certificate and the pg_insecure parameter is no longer supported.

Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.

Improved SSL support for Postgres: ssl mode now supports disable, require, prefer, verify-ca, verify-full options with default mode changed to require. Added pg_sslrootcert parameter to allow setting custom root cert for postgres connector, pg_insecure parameter is no longer supported as redundant.

Internal architecture refactor: The internal architecture of spiced was refactored to simplify the creation data components and to improve alignment with DataFusion concepts.