Performance and Cost Optimization Concepts
Explain the use of the Query Profile
The Query Profile in Snowsight shows how Snowflake executed a query. Access it from Query History to analyze performance.
Explain plans
The query profile displays the execution plan as a graphical tree of operators (table scans, joins, aggregations). Each node shows execution time, helping identify bottlenecks.
Query Profile: Visual execution plan. Nodes show operators and timing—use to find bottlenecks.
Data spilling
Spilling occurs when a warehouse lacks memory for intermediate results. Data spills first to local SSD, then to remote storage. Disk I/O is slower than memory, so spilling degrades performance.
Data Spilling: Memory overflow to disk. Local SSD first, then remote storage. Indicates warehouse may be too small.
Use of the data cache
The profile shows cache efficiency—percentage of data read from the warehouse’s SSD cache versus remote storage. Higher cache hit rates mean better performance.
Micro-partition pruning
The profile shows partition pruning effectiveness: total partitions versus partitions scanned. High scan ratios suggest adding a clustering key.
Partition Pruning: Profile shows total vs scanned partitions. Poor pruning = consider clustering key.
Query history
Query History in Snowsight lists all executed queries. Filter and search to find specific queries, then open their Query Profile.
Explain virtual warehouse configurations
Types of warehouses
- Standard Warehouse: Default type for general query workloads.
- Snowpark-optimized Warehouse: Larger memory for Snowpark workloads with memory-intensive operations.
Warehouse Types: Standard (general purpose) vs Snowpark-optimized (larger memory for Snowpark).
Multi-clustering warehouses
Multi-cluster warehouses (Enterprise+) automatically scale compute for high concurrency.
Scaling policies
- Standard: Starts new clusters quickly to handle load. Default policy, best for unpredictable workloads.
- Economy: Maximizes existing cluster utilization before starting new ones. Saves credits but may queue queries.
Scaling Policies: Standard (fast scale-out, may over-provision) vs Economy (maximize utilization, may queue).
Scaling modes
- Maximized: Starts all clusters up to the maximum immediately.
- Auto-scale: Default. Starts/stops clusters based on query load.
Warehouse sizing
Sizes range from X-Small to 6X-Large. Larger = more CPU and memory. Start small and scale up if you see data spilling or slow complex queries.
Sizing: Start small, scale up if spilling occurs. Each size up doubles compute and cost.
Warehouse settings and access
AUTO_SUSPEND: Suspends warehouse after inactivity period (seconds). Critical for cost control.AUTO_RESUME: Automatically resumes when a query arrives.- Access controlled by
USAGEprivilege granted to roles.
Key Settings: AUTO_SUSPEND (cost control), AUTO_RESUME (convenience). Grant USAGE to roles for access.
Outline virtual warehouse performance tools
Monitoring warehouse loads
Snowsight dashboards show running queries, queued queries, and utilization. Use to determine if warehouse is properly sized.
Scaling up compared to scaling out
- Scale Up: Increase warehouse size. Use for complex, long-running queries needing more memory/CPU.
- Scale Out: Add clusters (multi-cluster). Use for high concurrent query volume.
Scale Up vs Out: Up = bigger warehouse for complex queries. Out = more clusters for concurrency.
Query acceleration service
QAS offloads portions of large ad-hoc queries to serverless compute. Boosts performance for occasional large queries without running a larger warehouse full-time.
Query Acceleration Service: Serverless compute boost for large ad-hoc queries. Cost-effective alternative to larger warehouse.
Optimize query performance
Describe the use of materialized views
Materialized views store pre-computed query results. Queries against the view read stored results instead of re-executing. Snowflake automatically maintains the view as base data changes.
Materialized Views: Pre-computed, stored results. Auto-maintained. Faster reads, but has storage and maintenance costs.
Use of specific SELECT commands
Select only needed columns instead of SELECT *. Reduces data read from storage and network transfer.
Clustering
Clustering keys co-locate related data in micro-partitions. Improves pruning for queries filtering on clustered columns.
Clustering: Define clustering key on large tables. Improves pruning for filtered queries.
Search optimization service
Builds a search access path for fast point-lookup queries (returning few rows). Useful for dashboards and interactive apps on large tables.
Search Optimization: For point-lookups on large tables. Fast single-row or few-row queries.
Persisted query results
See Result cache below.
Understanding the impact of different types of caching
Three cache layers:
Metadata cache
Cloud Services layer caches table metadata (partition count, column value ranges). Enables fast partition pruning without scanning.
Result cache
Stores query results for 24 hours. Identical queries with unchanged data return cached results instantly—no compute cost.
Result Cache: 24-hour cache of query results. Same query + unchanged data = instant, free results.
Warehouse cache
SSD cache on warehouse nodes stores recently accessed table data. Faster than remote storage reads.
Three Caches: Metadata (pruning info), Result (24hr query results), Warehouse (SSD table data).
Describe cost optimization concepts and best practices in Snowflake
Understanding and exploring the costs of different Snowflake features and services
Cost insights feature in Snowsight
Built-in dashboards show costs by compute, storage, and serverless features. Drill down to individual warehouses and services.
Use of different table types and sizes
Use TRANSIENT tables for temporary/staging data—no Fail-safe period, lower storage cost.
Use of views
Standard views have no storage cost. Materialized views incur storage and compute costs for maintenance.
Use of search optimization paths
Search Optimization has storage cost (search access path) and compute cost (building/maintaining).
Storage costs
Charged per terabyte per month.
Compute costs
Charged in credits per second with 60-second minimum.
Cost Components: Storage (TB/month), Compute (credits/second, 60s min), Serverless (feature-specific).
Understand and explore cloud services costs in Snowflake
Cloud Services usage is free up to 10% of daily compute credits. Overage is billed.
Cloud Services: Free up to 10% of daily compute. Overage billed.
Costs considerations when using serverless features
Snowpipe, Automatic Clustering, Search Optimization run on Snowflake-managed serverless compute with separate pricing.
Cost considerations when moving data among regions
Replication
Storage cost in secondary account plus data transfer cost between regions.
Fail-over
Compute costs in secondary account when it becomes primary.
Monitor and control costs
Resource monitors
Track credit usage for warehouses. Configure alerts or auto-suspend at thresholds.
Resource Monitors: Track warehouse credits. Set alerts and auto-suspend thresholds.
Snowflake Budgets service
Set monthly spending budgets and track actual spend against them.
Attribute costs
Cost center tagging
Tag warehouses and objects with department names. Use tags in ACCOUNT_USAGE views for cost attribution.
Use of the ACCOUNT USAGE schema
Query views like WAREHOUSE_METERING_HISTORY, STORAGE_USAGE, SERVERLESS_TASK_HISTORY for detailed cost analysis.
Cost Attribution: Tag objects, query ACCOUNT_USAGE views (WAREHOUSE_METERING_HISTORY, STORAGE_USAGE) for analysis.