How to Combine Files in Power BI Using Power Query
You can combine files in Power BI by using the 'Get Data' > 'Folder' option and the 'Combine & Transform Data' feature in Power Query. This approach saves you significant time and effort. For example, loading one million rows with Fast Data Load enabled can be eight seconds faster, and optimizing merges can reduce refresh times from over 30 minutes to just 4 minutes. When you add new files to your folder, Power Query picks them up automatically, giving you clean, up-to-date data with minimal manual work.
Key Takeaways
Connect Power Query to a folder to combine all files inside automatically, saving time and effort.
Choose a sample file with the most complete structure to avoid missing columns in your combined data.
Apply consistent transformations to all files to keep your data clean and reliable every time you refresh.
Filter out unwanted files early to speed up processing and prevent errors during combination.
Automate refreshes so Power Query updates your data with new files without manual work.
Combine Files: Steps
Connect to Folder
To start, you need to connect Power Query to the folder containing your files. This step ensures that Power Query can scan and list all files available for combination.
Open Power BI Desktop and select Get Data.
Choose the Folder connector from the list.
Enter the full path to your target folder.
If you use Power Query Online, specify the on-premises data gateway if required.
Select the appropriate authentication method and provide credentials if prompted.
Tip: Connecting to a folder, rather than individual files, allows you to combine files efficiently and automate future updates. When you add new files to the folder, Power Query includes them automatically during the next refresh.
Combine & Transform Data
Once you connect to the folder, Power Query displays a table listing all files within it. You can preview file names, extensions, and other metadata. At this stage, you have two main options:
Combine & Load: Quickly combine files and load the data directly into Power BI.
Combine & Transform Data: Open the Power Query Editor to clean and shape your data before loading.
Select Combine & Transform Data to access advanced transformation features. This approach gives you full control over the data cleaning process, ensuring that you only load high-quality, usable data.
Choose Sample File
Power Query prompts you to select a sample file. This file acts as a template for the combination process. You should always pick the file with the most complete structure—one that contains all expected columns and headers.
Note: If you select a sample file missing certain columns, Power Query may exclude those columns from the final combined dataset. Choosing the right sample file helps prevent missing data and ensures a smooth combination process.
Apply Transformations
After selecting your sample file, Power Query automatically generates a custom function. This function applies the same set of transformations to every file in the folder. You can now perform essential data cleaning steps, such as:
Removing unnecessary rows (like extra headers or footers)
Promoting the correct row to column headers
Setting accurate data types for each column
Applying these transformations at the function level guarantees consistency across all files. Uniform transformations help maintain data quality and prevent discrepancies. Standardizing formats, naming conventions, and data structures reduces complexity and supports accurate analysis. When you combine files using this method, you benefit from reliable and consistent data every time you refresh.
Load Data
Once you finish applying transformations, you can load the combined data into Power BI. Click Close & Apply in the Power Query Editor. Power BI imports the cleaned, combined dataset into your model.
You can monitor the reliability and performance of your data load process by using tools like SQL Server Profiler or Power Query diagnostics. These tools help you identify bottlenecks, track refresh durations, and optimize your queries for faster performance. Regularly reviewing performance metrics ensures that your solution remains efficient as your data grows.
Automation Advantage: Every time you refresh your Power BI report, Power Query scans the folder and combines any new files automatically. This automation eliminates manual intervention and keeps your reports up to date with the latest data.
By following these steps, you can combine files from a folder in Power BI with minimal effort and maximum reliability.
Troubleshooting
File Structure Issues
When you combine files in Power Query, you may encounter problems if the files do not share the same structure. Differences in columns, headers, or file formats can cause errors or missing data. You should always check that your files have consistent layouts before combining them. The table below highlights common structure issues and their impact:
If you notice missing columns or unexpected errors, review your sample file selection. Always choose a file with the most complete structure to avoid losing important data.
Filtering Unwanted Files
Filtering out unnecessary files early in your query helps you avoid errors and speeds up processing. You can filter by file extension, name, or date before Power Query applies the custom function. This step ensures only relevant files are processed.
Apply filters and remove unwanted columns as soon as possible to enable query folding. Query folding pushes filtering back to the source, reducing the data Power Query needs to process.
Use parameters for date ranges to limit the files included during development or refresh.
Remove unneeded columns from large tables to improve performance and compression.
Implement incremental refresh by partitioning data on date fields, so only recent data is refreshed.
Use query diagnostics tools to verify that your filters and folding are working as expected.
Tip: Filtering files before combining them prevents errors from incompatible formats and keeps your data model efficient.
Error Handling
You may face errors such as type mismatches, missing columns, or invalid data during the combine process. To handle these issues, validate data types and formats before processing. Use Power Query’s try...otherwise
functions to catch errors without stopping your workflow. Create custom error columns to flag problematic records for review. Logging errors helps you track and resolve recurring issues. Unit testing your queries and documenting changes ensures your solution remains robust as your data evolves. By applying these strategies, you maintain data integrity and keep your Power BI reports running smoothly.
Advanced Customization
Edit M Code
You can unlock advanced capabilities in Power Query by editing the automatically generated M code. The Advanced Editor allows you to fine-tune queries for precise and complex transformations. For example, you might combine data from multiple workbooks and sheets, handle files with inconsistent headers, or import sheets where data starts from different rows. You can also create reusable custom functions to standardize repetitive tasks and improve consistency. When you customize M code, you gain flexibility to address unique data challenges, such as unpivoting and consolidating data from multiple sources.
Tip: Always measure performance before and after making changes. Use tools like Power BI Performance Analyzer or DAX Studio to test your queries several times. This approach helps you validate improvements and avoid unexpected slowdowns.
Custom Columns
Custom columns let you enrich your data model by creating new fields based on existing data. You can use DAX formulas to categorize, flag, or generate new metrics that align with your business needs. For instance, you might classify customers into segments based on purchase history or calculate sales after tax with a formula like:
Sales After Tax = Sales Amount * (1 + Tax Rate)
Custom columns help you tailor your analysis and improve the precision of your insights. However, use them thoughtfully, especially with large datasets, since calculated columns consume memory and can impact performance.
Create custom columns to segment, flag, or derive new metrics.
Use clear naming conventions for easier maintenance and understanding.
Regularly review your custom columns to ensure they remain relevant and efficient.
Automation Tips
Automating your Power Query workflows ensures your data stays current and reliable. Set up scheduled refreshes to update your reports automatically. Maintain your data gateway configurations and keep credentials up to date to prevent refresh failures. Incremental refresh can optimize updates for large datasets, reducing network strain and improving refresh speed. You can also set up email alerts for refresh failures, helping you respond quickly to issues.
Automate refresh schedules for timely updates.
Use incremental refresh to reduce processing time.
Deploy alert systems to catch errors early.
Integrate version control to track changes and maintain reliability.
By applying these advanced techniques, you streamline your data processes, reduce manual effort, and ensure your Power BI solutions remain robust and scalable.
When you combine files with Power Query, you gain a streamlined, automated workflow that saves time and reduces errors. You can track your progress with key metrics such as faster refresh times, improved query performance, and positive user feedback. Explore advanced options like custom functions or incremental refresh to handle more complex scenarios. Next, try combining files from SharePoint or set up automated refresh schedules. Mastering these skills helps you deliver reliable, up-to-date data for better decision-making.
FAQ
How do you refresh combined data when new files are added?
Power Query automatically includes new files from the folder during each refresh. You only need to click Refresh in Power BI Desktop or schedule automatic refreshes in the Power BI Service.
Can you combine files from SharePoint or OneDrive?
Yes, you can connect to SharePoint or OneDrive folders using the appropriate connector. The process works the same as with local folders. You get the same automation and transformation features.
What file types can Power Query combine?
You can combine CSV, Excel, and text files. Power Query supports many formats, but you should filter out unsupported or irrelevant files before combining to avoid errors.
What should you do if files have different columns?
Choose a sample file with the most complete structure. Power Query will include all columns found in the sample. If some files lack columns, those fields will show as null in the combined data.