Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
otjena
New Member

(Help) Working with large sets of multi dimensional JSON files from DocumentDB (Cosmos DB)

Hello

 

I'm currently trying to analyse quite a large set of multi dimensional JSON documents from a DocumentDB source. 

Because of the not so flat documents I'm trying to split it up in multiple tables (simple example below): 

 

{
    "id": 123,
    "success": false,
    "data": {
        "stats": {
            "data1": 123,
            "data2": 123,
            "data3": 123,
            "data4": 123,
            "data5": 123
          }
        },
        "tests":[
          {
            "data": "item1" 
          },
          {
            "data": "item2" 
          }
        ]
}

In this particular example, I would like to split the document up in 3 related tables, with id as key.

 

"Record" with id and the success coulumn.

"Stats" with id and data1 to data5

"Tests" with id and data

 

The problem I'm having is that for each table it queries the database, witch takes quite a while.

What I have tried to do is make one query and then create the tables by referencing the one query, but it seems that it still wants to make one query for each table. 

 

Is there something I have missed, or is there a mutch better and faster way to do this?

 

Any help is greatly appreciated!

 

 

1 ACCEPTED SOLUTION
v-shex-msft
Community Support
Community Support

Hi @otjena,

 

You can refer to below steps to manual analysis these data and split them to new tables:

 

1. Convert json source to table.

0.png

Query:

let
    Source = Json.Document(File.Contents("C:\Users\xxxxx\Desktop\new 5.json")),
    #"Converted to Table" = Record.ToTable(Source),
    #"Transposed Table" = Table.Transpose(#"Converted to Table"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])
in
    #"Promoted Headers"

 

 

2. Use Table.SelectColumns function to create the split tables.

 

1.PNG

 

3. Analysis and expand these data.

2.PNG

 

Full query:

Record:
let
    Source = Table.SelectColumns(SourceTable,{"id","success"})
in
    Source

Stats:
let
    Source = Table.SelectColumns(SourceTable,{"id","data"}),
    #"Added Custom" = Table.SelectColumns(Table.AddColumn(Source, "stats", each Record.FieldValues([data][stats])),{"id","stats"}),
    #"Expanded stats" = Table.ExpandListColumn(#"Added Custom", "stats")
in
    #"Expanded stats"

Tests:
let
    Source = Table.SelectColumns(SourceTable,{"id","tests"}),
    #"Expanded tests" = Table.ExpandRecordColumn(Table.ExpandListColumn(Source, "tests"), "tests", {"data"}, {"data"})
in
    #"Expanded tests"

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

4 REPLIES 4
v-shex-msft
Community Support
Community Support

Hi @otjena,

 

You can refer to below steps to manual analysis these data and split them to new tables:

 

1. Convert json source to table.

0.png

Query:

let
    Source = Json.Document(File.Contents("C:\Users\xxxxx\Desktop\new 5.json")),
    #"Converted to Table" = Record.ToTable(Source),
    #"Transposed Table" = Table.Transpose(#"Converted to Table"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true])
in
    #"Promoted Headers"

 

 

2. Use Table.SelectColumns function to create the split tables.

 

1.PNG

 

3. Analysis and expand these data.

2.PNG

 

Full query:

Record:
let
    Source = Table.SelectColumns(SourceTable,{"id","success"})
in
    Source

Stats:
let
    Source = Table.SelectColumns(SourceTable,{"id","data"}),
    #"Added Custom" = Table.SelectColumns(Table.AddColumn(Source, "stats", each Record.FieldValues([data][stats])),{"id","stats"}),
    #"Expanded stats" = Table.ExpandListColumn(#"Added Custom", "stats")
in
    #"Expanded stats"

Tests:
let
    Source = Table.SelectColumns(SourceTable,{"id","tests"}),
    #"Expanded tests" = Table.ExpandRecordColumn(Table.ExpandListColumn(Source, "tests"), "tests", {"data"}, {"data"})
in
    #"Expanded tests"

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hello @v-shex-msft

First, thank you for your help, great guide and very easy to follow!

 

When I do it this way, and all other ways I can think of it seems that all of the tables actually queries my source.  (see below)

 

Update.png

 

Is it actually query the documentDB for each table or is it missleading? (it feels quite a bit slower than just running one table)

 

Best regards

Otjena

 

 

Hi @otjena,

 

I'd like to suggest you turn on the "paralla loading of tables" feature, it may increase the refresh preformance.

4.png

 

Notice: all of expand tables are based on source table, so these tables will waiting for refreshing of source table .

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hello @v-shex-msft

 

That one was already active, but thanks for the tip!

 

It might be as you say, that the oter tables is in fact waiting for the source one and I am misslead to beleave that they all contact the database.

 

I will mark this one as answered, thanks for all the help you have been giving me!

 

Regards

Otjena

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.