Sıra | DOSYA ADI | Format | Bağlantı |
---|---|---|---|
01. | Dataset Group Group Optimized | pptx | Sunumu İndir |
Transkript
Welcome!Power BI User Group (PUG)New York
Power BI Dataset Design Brett PowellBI Consultant, Author
Objectives1) Definition of Datasets 2) Dataset Design Process 3) Import vs. DirectQuery Factors4) Managed Data Access Layer5) Relationship Types and Patterns6) Usability Tips and Examples 7) Performance and Scalability Tips3
Session Agenda• The Role of Datasets in Power BI• Planning Scalable Datasets• End-to-End Dataset Development • Data Source and Access Layers • Data Model and Metadata• DAX Measures and Client Interface• Tips and Examples:• Usability• Performance and Scalability • Analysis4
Meet Your Presenter• Boston BI User Group (PUG) Leader• BI Consultant, Frontline Analytics• Author of Power BI Book(s)• Blogger, Insight Quest• Sites:• http://insightsquest.com• http://frontlineanalytics.net• Contact:• Email: Brett.Powell@FrontlineAnalytics.net• Twitter: @BrettPowell76• LinkedIn5
Datasets in Power BI
• Analytical Data Models (not Reports)• Platforms for Reporting & Analysis• User Interface for Client Tools• Embedded Business Logic• Reports connect to Datasets• Dataset Layers:1. Data Access and Transform (M Query)2. Data Model (Import or DirectQuery)3. DAX MeasuresPower BI Datasets DefinedCalculations• DAX MeasuresData Model• Hierarchies, Metadata• Relationships• Tables, Columns (Types)Queries• Tables, Columns (Types)• Transformations (Optional)• Data Source(s)Three Layers of PBI Datasets7
• Intuitive User Interface• Version Control; Reusability• Data Security • Query Performance • Scalability• Analytics• Availability •ManageabilityDataset Design ObjectivesPower BI Publisher for ExcelPBIX Report: Live ConnectionAssigning Users/Groups to Row Level Security Roles8
Dataset Designers in Power BI TeamsDataset Designers• Collaborate with:• Data Source Owners• Report Authors• Power BI Admin(s)• Data Access• Privacy Levels• M (or SQL)• Authentication• Data Refresh • Data Model• DAX Measures• Security Roles• MetadataReport Authors• Collaborate with:• Business Users• Dataset Designers• Reports and Dashboards• Design Standards• Interactivity• Mobile Experience• Mobile Optimized• Content Distribution• Apps• Subscriptions• Support Self-Service• Analyze in ExcelPower BI Admin(s)• Collaborate with:• O365 Global Admin• Governance Team• BI Team• Tenant Settings• Security Groups• Premium Capacity• Capacity Allocation• Power BI Licenses• On-Premises Gateway• Usage Monitoring• Resource Monitoring• Organizational Policies9
Planning Datasets
Four Step Dataset Design Process1. Identify the Business Process2. Declare the Grain3. Identify the Dimensions4. Define the FactsExpand and iterate on models:• New metrics and attributes • New fact and dimension tables• Revised Definitions• Performance TuningData Warehouse BUS Matrix11
Data SourcesData WarehousePower BI DesktopDataset Planning• Where is the data? • Cloud, On-Premises, Both• Databases, Files, Both• How much integration is required?• Where can/should it be implemented?• Are we targeting DirectQuery?• HTAP (Hybrid Transactional Analytical)• Implications for Data Source• Implications for DatasetFact and Dimension TablesSalesGeneral LedgerInventorySQL ViewsM QueriesDAX Calculated Tables and ColumnsDAX MeasuresETL/ELT Data Transformation Process1234561. Integrate and transform before PBI Dataset if possible (1-3)2. If not possible, write efficient M queries (4); DAX as last option (5-6) Plan12
DirectQuery Factors1. DirectQuery Data Sources• Single source and database2. Scalability of Import Models• Size limit; Full Refresh• Analysis Services RAM; Migration3. How complex is the model?• Data Access and Analytics4. Available Resources• DW or DB Relational Database Skills• Features and Hardware of Source DBPower BI DirectQuery Data SourcesAmazon Redshift Azure HDInsight Spark (Beta)Azure SQL DatabaseIBM Netezza (Beta)Impala (v 2.x) Azure SQL Data WarehouseOracle (v12+) SAP BW (Beta) SAP HANASnowflake Spark (v.9+) (Beta) SQL ServerTeradataDirectQuery Supported Data Sources as of 8/31 13
Scaling Power BI Datasets• Power BI Premium at GA• Isolation (Noisy Neighbor)• 48X (vs 8X) Refreshes/Day• Remove 10GB/User Limit• Premium Roadmap• Large Datasets (1GB Current)• Incremental Data Refresh • Scale Out, Pin to Memory• Analysis Services • Migrate PBIX to SSAS Model• Azure (AAS) or On-Premises• Partitions, Perspectives, Display Folders• Source Control Integration• Automation, AdministrationCapacity Nodes Backend Cores Memory Direct or Live QueriesP1 4 25 GB 30/secP2 8 50 GB 60/secP3 16 100 GB 120/secPower BI Premium Capacities at GA (excluding EM)Azure Analysis Services Standard Tier Instances (as of 9/5/17)Instance QPUs MemoryS0 40 10 GBS1 100 25 GBS2 200 50 GBS4 400 100 GBS8 320 200 GBS9 640 400 GB14
Dataset Development
Data Warehouse Schema and Views•Dimension and Fact tables•Referential Integrity•Robust Date Table• Date Intelligence Columns• Slowly Changing Dimensions• Historical Tracking• SQL Views for Dataset• 1:1 with Tables in PBIXDatabase Diagram: SQL Server16
Power BI Data Access: M Queries•Data Source Parameters• Prod vs Dev • Staging Queries• Refresh Only•Query Groups• Facts and Dims•Parameter Tables• Load OnlyM Queries17
Power BI Data Models•Columns•Data Types• Sort By•Properties•Tables•Relationships•Hierarchies•Measures• FormattingColumn Properties and MetadataRelationships View18
Relationship Pattern: Actual vs. Plan1. Bridge tables2. Relationships• Bidirectional: Dim to Bridge• Single: Bridge to Plan3. Actual to Plan Measures:• Check Filter Context• Compare Actual to PlanActual vs. Plan ModelingReturn Blank if Filter Context Not Supported by Plan 19
DAX Measures and Client Interface1. Compare and Combine Facts• Actuals versus Budget• Online Sales Plus Store Sales2. Dynamic Date Columns• Report, Page, and Visual Filters3. Date Intelligence• Compare Common Periods4. Target Measures• KPI and Gauge Goals/Targets20
Bonus: Dataset Documentation
Power BI Documentation Dataset and Report1. Identify SSAS Server and Database of PBIX via DAX Studio2. Query SSAS DMVs via Analysis Services Connector (w/Params)3. Integrate DMVs to form documentation tablesRetrieving SSAS DMV Metadata into PBIX Sample Documentation Report Visual 22
Performance and Scalability
Segment 31 M RowsSegment 21 M RowsSegment 11 M RowsImport Dataset Performance and Scale Tips• Import Mode• High Cardinality Columns• High Cardinality Relationships • Optimal Sort Order• DAX Optimization• Leverage the Storage Engine• Early, efficient filters• Limit iterations and complex row operations2015201620171.51.81.923235.43.8Date Price Qty SalesImport Mode Columnar Storage24Order #1234123512361. Can fact table columns be removed?2. Can DAX Measures eliminate the need for columns?3. What is the optimal sort order?4. Are there any calculated columns that can be replaced?
DirectQuery Performance Test and Tips• Optimize Data Source• Columnstore Index • Referential Integrity• Efficient SQL Views• Optimized DAX Measures• Use Storage Engine Functions• See DAX Formula Compatibility in DQ ModeFact Table Duration CPU ReadsClustered Columnstore with Ref Integrity 1.3 sec 2.3 sec 54KPage Compressionwith Ref Integrity 2.9 sec 4.3 sec 178KPage Compression without Ref Integrity 3.4 sec 4.9 sec 178KTest Conditions• SQL Server 2016• 4 CPU Cores, MAXDOP = 2• 12M Row Fact Tables• Common Report Query• Sales by Month for a yearDirectQuery Performance Test ModelDirectQuery Performance Test Results25
Capture and Analyze Power BI Queries1. Retrieve DAX Query from PBIX• SQL Server Profiler2. Analyze Query in DAX Studio• Import or DirectQuery3. Retrieve SQL Statement of DirectQuery Model Profiler Trace of Power BI Desktop DatasetAnalyze DAX Query with DAX Studio (Import Model)26
Thank you for Attending!Don’t forget to join your local PUG to enjoy year-round networking and learning. www.pbiusergroup.com/NYNJwww.pbiusergroup.com/philly