
You are learning Power Query in MS Excel

How to connect to and import data from web pages using Power Query?

Connecting to and importing data from web pages using Power Query involves several steps. Here’s a detailed guide:

Step-by-Step Process

1. Open Power Query:
- In Excel, go to the `Data` tab.
- Select `Get Data` > `From Other Sources` > `From Web`.

2. Enter Web URL:
- In the dialog box that appears, enter the URL of the web page from which you want to import data.
- Click `OK`.

3. Authentication (if required):
- If the web page requires authentication, you'll be prompted for credentials. Choose the method the site expects (for example, Anonymous, Windows, Basic, Web API, or Organizational account).

4. Navigator Window:
- After connecting, the Navigator window appears and lists the tables and other data elements that Power Query detected on the web page.
- Select the table or data element you want to import. You can preview the data in the right pane.

5. Load or Transform Data:
- Click `Load` to load the data directly into Excel.
- Click `Transform Data` to open the Power Query Editor for further data shaping and cleaning.
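
The steps above are recorded as a query written in Power Query's formula language (M), which you can view under `Home` > `Advanced Editor`. The sketch below shows roughly what such a query can look like. The URL is a hypothetical placeholder, the choice of the first detected table is an assumption, and the exact steps Excel generates can differ by version.

```powerquery
let
    // Hypothetical URL: replace with the page you want to import
    Url = "https://example.com/report.html",
    // Download the page and let Power Query detect its HTML tables.
    // The result has one row per detected element, with the table
    // itself stored in the Data column (this is what the Navigator lists).
    Source = Web.Page(Web.Contents(Url)),
    // Assumption: the table you want is the first one detected
    FirstTable = Source{0}[Data]
in
    FirstTable
```

Choosing a different entry in the Navigator corresponds to changing the row that is picked from `Source`.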

Using the Power Query Editor

1. Cleaning and Transforming Data:
- In the Power Query Editor, you can apply transformations to clean and shape your data. Common transformations include:
  - Removing Columns: Right-click a column header and select `Remove`.
  - Removing Rows: Go to the `Home` tab > `Remove Rows` (for example, `Remove Top Rows` or `Remove Blank Rows`), or filter the unwanted rows out.
  - Changing Data Types: Select a column and go to the `Transform` tab > `Data Type`.
  - Filtering Rows: Use the filter arrows in the column headers.
  - Replacing Values: Right-click a column header and select `Replace Values`.

2. Combining Data:
- If the data you need spans multiple tables or pages, you can combine them: `Merge Queries` joins two tables on matching columns, while `Append Queries` stacks the rows of tables that share the same columns.
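
Each action in the editor becomes one step in the query's M code. As a rough illustration, the self-contained sketch below applies the transformations listed above to two small inline tables. The table names, column names, and values are made up for the example and stand in for tables imported from a web page.

```powerquery
let
    // Hypothetical sample tables standing in for two imported web tables
    RegionA = #table(
        {"Country", "GDP", "Notes"},
        {{"Country A", "1,200", "est."}, {"Total", "1,200", ""}}
    ),
    RegionB = #table(
        {"Country", "GDP", "Notes"},
        {{"Country B", "950", "est."}}
    ),
    // Append Queries: stack the rows of both tables
    Appended = Table.Combine({RegionA, RegionB}),
    // Removing Columns: drop a column that is not needed
    RemovedColumns = Table.RemoveColumns(Appended, {"Notes"}),
    // Filtering Rows: drop the summary row
    FilteredRows = Table.SelectRows(RemovedColumns, each [Country] <> "Total"),
    // Replacing Values: strip thousands separators from the text
    ReplacedValues = Table.ReplaceValue(FilteredRows, ",", "", Replacer.ReplaceText, {"GDP"}),
    // Changing Data Types: convert the cleaned text to a number
    ChangedType = Table.TransformColumnTypes(ReplacedValues, {{"GDP", type number}})
in
    ChangedType
```

`Merge Queries` produces a join step instead (`Table.NestedJoin`), which matches rows on key columns rather than stacking them.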

Example of Importing Data from a Web Page

Let’s go through a concrete example:

1. Example URL:
- Assume you want to import data from a Wikipedia page: `https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)`.

2. Process:
- Get Data: Go to `Data` > `Get Data` > `From Other Sources` > `From Web`.
- Enter URL: Enter the URL `https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)` and click `OK`.
- Select Table: In the Navigator window, select the table that contains the country names and GDP values.
- Transform Data: Click `Transform Data` to open the Power Query Editor.
- Clean Data: In the Power Query Editor, you might:
  - Remove unnecessary columns.
  - Change the data type of GDP values to `Decimal Number`.
  - Filter out any irrelevant rows.

3. Load Data:
- After cleaning and transforming the data, click `Close & Load` to load the data into Excel.
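
For reference, the example above might correspond to a query along these lines. This is a sketch rather than the query Excel will produce: the table index and the column names `Country` and `GDP` are assumptions, and the page's real layout and headers may differ, so check the Navigator preview and adjust them.

```powerquery
let
    Source = Web.Page(Web.Contents("https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(nominal)")),
    // Assumption: the GDP table is the first table detected on the page
    GdpTable = Source{0}[Data],
    // Hypothetical column names: replace with the headers shown in the preview
    KeptColumns = Table.SelectColumns(GdpTable, {"Country", "GDP"}),
    // Change the GDP values to Decimal Number
    ChangedType = Table.TransformColumnTypes(KeptColumns, {{"GDP", type number}}),
    // Drop rows whose GDP value could not be converted (for example, footnote rows)
    Cleaned = Table.RemoveRowsWithErrors(ChangedType, {"GDP"})
in
    Cleaned
```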

Handling Pagination and Dynamic Content

- Pagination: The `From Web` dialog imports one URL at a time. If the data spans multiple pages, the usual approach is to write a custom function that fetches a single page, invoke it for each page number or URL, and append the results.
- Dynamic Content: Content that only loads after interaction (e.g., clicking a button) or that is rendered by scripts may not appear in the Navigator. In that case, use the site's API if one is available, or a tool like Power Automate to extract the data.
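
A minimal sketch of the pagination approach is shown below. It assumes a hypothetical site that serves its pages as `?page=1`, `?page=2`, and so on, that every page contains the same table as its first detected element, and that there are three pages. All of these would need to be adapted to the real site.

```powerquery
let
    // Custom function: fetch one page and return its first detected table
    GetPage = (pageNumber as number) as table =>
        let
            // Hypothetical URL; the Query option appends ?page=<n> to it
            Response = Web.Contents(
                "https://example.com/products",
                [Query = [page = Number.ToText(pageNumber)]]
            ),
            Source = Web.Page(Response),
            Data = Source{0}[Data]
        in
            Data,
    // Assumed page range
    PageNumbers = {1..3},
    // Invoke the function once per page, then append the results
    Pages = List.Transform(PageNumbers, GetPage),
    Combined = Table.Combine(Pages)
in
    Combined
```

Keeping the page number in the `Query` option rather than concatenating it into the URL keeps the base address fixed, which tends to make the query easier to maintain.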

Tips

- Inspect the Web Page: Use browser developer tools to inspect the structure of the web page and identify the elements containing the data you need.
- APIs: If the website provides an API, consider using it for more reliable and structured data retrieval.
- Web Scraping Tools: For more complex scenarios, an external web scraping tool may be necessary. Its output (for example, a CSV or JSON file) can then be imported with Power Query.
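
When an API is available, the same `From Web` connector can read it. The sketch below assumes a hypothetical endpoint that needs no authentication and returns a JSON array of flat objects with identical fields. Real APIs often nest their data or require keys, so the parsing steps would change accordingly.

```powerquery
let
    // Hypothetical endpoint: replace with the API's documented URL
    Response = Web.Contents(
        "https://api.example.com/v1/countries",
        [Headers = [Accept = "application/json"]]
    ),
    // Parse the JSON text into M values (here: a list of records)
    Parsed = Json.Document(Response),
    // Turn the list of records into a table, one column per field
    AsTable = Table.FromRecords(Parsed)
in
    AsTable
```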

By following these steps, you can effectively connect to and import data from web pages using Power Query.
