Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
appeljr
Frequent Visitor

Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

I'm a historian, not a datanaut, but I'm trying! My datasource (British Library's Engish Short-Title Catalogue) generates a two-column (character deliminated) running record of the exported publications. It makes clear sense to the human brain, since it looks like a card catalogue entry, and we can easily tell when a new publication card begins, how details are linked, etc. It would make clear sense in Access or any other database, but the project I'm working on is already invested in Excel.

 

Column1.1Column1.2
     Record number :         1
     ESTC Citation No.    W11611
     Author - personal    Duché, Jacob, 1738-1798.
     Title                The American vine, a sermon, preached in Christ-Church, Philadelphia, before the Honourable Continental Congress, July 20th, 1775. Being the day recommended by them for a general fast throughout the united English colonies of America. By the Reverend Jacob Duchë, M.A.
     Publisher/year       Philadelphia. : Printed by James Humphreys, Junior, M,DCC,LXXV. [1775]
     General note         Two states of gathering B noted. In one, B1, B2 are signed C, B2. In another, B1, B2 are signed B, C2. Two states of gatherings C and D noted. The first has "shine" for "descend" on p. 29, line 20, and an erratum note on p. 34 correcting this error. The second has "descend" o p. 29, and no erratum note.
     Uncontrolled note    Signatures: A⁴ A-C⁴ D² (D2 blank). - VERIFY DATA ON STATES OF GATHERINGS B,C,D. SEE EDWIN WOLF ARTICLE IN BSA PAPERS, V. 63, P. 264, WHICH SEEMS TO BE IN ERROR
     Citation/references  Evans 14012
     Citation/references  Sabin 21047
     Subject              United States -- History -- Revolution, 1775-1783.
     Added name           Humphreys, James, 1748-1810, printer.
     Added name           United States. Continental Congress.
     Added name           United States. Continental Congress.
     Record number :         2
     ESTC Citation No.    W479052
     Author - personal    Duché, Jacob, 1738-1798.
     Title                An anthem to be sung in the chapel of the Foundling Hospital, on Sunday the 20th of May, 1781, when a charity sermon will be preached by the Rev. Mr. Duché, Rector of Christ-Church and St. Peter's, Philadelphia.
     Publisher/year       [Philadelphia? : s.n., 1781]
     General note         Imprint conjectured from internal evidence.
     General note         Dimensions: 33 x 21 cm.
     General note         The text is in verse and consists of twenty-four lines organized into six stanzas; the first and last: "Chorus.", the second: "Recitative.", the third and fifth: "Solo.", and the fourth: "Duet."
     General note         The first line of text reads: "Almighty Lord! dispose each mind".
     Corporate subject    Pennsylvania Hospital (Philadelphia, Pa.)
     Subject              Anthems -- Early works to 1800.
     Subject              Charity-schools -- Pennsylvania -- Philadelphia -- Early works to 1800.
     Subject              Christian education of children -- Early works to 1800.
     Subject              Schools -- Pennsylvania -- Philadelphia -- Early works to 1800.
     Added name           |aPennsylvania Hospital (Philadelphia, Pa.)
     Added Title          Almighty Lord! dispose each mind.

So I can plug this into an already existing project with a layout where each publication takes up one row, I need some function to mark or label the different publication attributes so I can transpose:

Duché, Jacob, 1738-1798.AuthorPennsylvania: a poem. By a student of the College of Philadelphia.Philadelphia : Printed by B. Franklin, and D. Hall, MDCCLVI.[1756]Pennsylvania -- Poetry.ESTChttp://estc.bl.uk/W23024Evans, 7648

 

I've dug deep enough to be sure there is some solution, but every method I've seen either needs equal number of rows (these are unpredictable) or results in information being lost.

 

The repetitions in column 1.1 could be concatenated into a single entry (line break would be the best delim if possible, but otherwise bar), but I can't throw any details out.

 

If there is a way to "tag" each line with the ESTC number (which is unique to that publication), that would work, but every tutorial I've tried in the last two days to create an index, group, and then expand, doesn't get the job done.

 

Any help or directions would be most welcome!

1 ACCEPTED SOLUTION

Hi @appeljr 

please create a new query and paste this code into its advanced editor (replacing all existing code there):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVbLbttGFP2VW60SgGIk2YkfXRQ0JUcO/BBEJU5hezEiR+I05IwwQ9pR0E3+pNuiu6JfkG77Ff2Snju0Dcl1kNgIWhCGxeG9577PnbOz1qdfmofGMjU2I12XU2lp9+acn1bgxajbughWNAbJJKZYVaJSRtOxCVfkT7vdF9078lFd5cZSmxbSOqNFsSLfr9Nc/vUxoFciNdOAulsb2+3u1s52uI4xUVUhV3277wHeJJcUldKqVGi6VFoGJMhJWxod0MJKAXMZKU1xbpWr2nFe2zQPaJSrQmSyWORKBDSVM2MlVQAbGm1qK6aFpNjoCoi6EgX/nlvpHByviyX1OlXOzm89D2lPKj33uplYkkV6y1LqDGanSz4uCeDwai61tICaCVfh3Jp6npu68pq1VhUUBnpeKJdTagqjlXRkZjfRwY5HQ/0upQV+k0HihP75W0BHYXQnhaN6ymDSPltKYf+VudUMhLRLI6t01Tj9SpSwPazLRW7l0seslbEwEvTjODh8+/ZNSGcc/cW6yZfXIWpT3VM8LteVIYdGakKbCwRkOXt7XiUL6UCT4SLudfHXI4GqODXX8CvmAy8gIAu9+4T2Aooh9BkzjmIoZ9S/scbdM1MW9ciFo/OWy1Hv85Yv2Hkrky5FovGOtl+E1NsJqIAAqh94HLSctFZUdekBr8U2NlE/iz6omr5QjqWMbcw59AdUG3srFm4MMK42a7h36vpaA6GypihktpZp5DdBGqCHTt2l6O+Pf1DUjvlf/9Pv9KTfo2kh9LunIWbzzWB8sP8j9aNJRCfHlEyiySChk316GU2G+Hb8MkE246AfUjIY0KB/enBMpyeH+xSNJwfx4YDwvpdENIpGg3ESEFrixQYGC2G82AzodHgQD1n1KKHJCe15+cF4fDJej+aGV55ZOeO+TqW7jmVwKbSj7man2/talURMMeq9bmdza10lqac/oSBfIJPXzRQmTee02zQEZRi75J8YO1PUbLaZe5DW9sadykQZT73G9HzWxOpQ8ZQx2CYYcLvbYb7iEbQPR11zPLyXuP4L0K9YLb0HrJbNrZ3O897/t1siphpP4JXBiiBXY57RX8zCaS4WsmB24bd9U+us4HEfGrdAOEXAZJDgVDSszQuDpY/Ekh3bBnNd5RIGGMmqanm9tOhKFQUbu91d01vaD+kIJOIJ/9eAk43eZMy11eYJJKlCGkm00nnd6fS23Pq+e9ieOFtV/QGLwoU6bGJ4KPsflL7DQY+apxE0ldHMmpJ826OYJC9VxgMdPhC5r7ByHXoIvLexQe/BAZSWD0Vhfq7k+4pA2agz9qyTPp9w2CHFfptUV5iBZXuGW4LfBji0c6HVB3/PQKc49Z6Xj/4g3Pe+dM2GYZwCq38XvB+jh2sXnrcCL9DsBP6AovphuJS3H7FAbOa1Z2pW5SyV4Hrgv/OptwBnmk/9Wlb49IjAGy/9fuMoOQ1owcwxalSUap6jSQ8x399RptzCIDXcoVQqXmB3Uh0buzDYYDw0a8zLFw+ptVsWYHclbueFnqzfyEYifPoYCo/8wHryHgiLm9qVse8cD3B3u9MJHwMZNwPadmluTOGh10Lg9xXfv61pnmzF94ysThuSRHFSmMuw+L6lqeRbRvcV2+Vn8cg+aLA/y+HcA1/oVjh78Q8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1.1 = _t, Column1.2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1.1", type text}, {"Column1.2", type text}}),
    TrimText = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    AddKey = Table.AddColumn(TrimText, "Key", each if [Column1.1] = "Record number :" then [Column1.2] else null),
    FillDownKey = Table.FillDown(AddKey,{"Key"}),
    PivotWithTextCombination = Table.Pivot(FillDownKey, List.Distinct(FillDownKey[Column1.1]), "Column1.1", "Column1.2", each Text.Combine(_, "#(lf)"))
in
    PivotWithTextCombination

A key-column with be created on the record number and the pivot-formula is tweaked in the 5th parameter to concatenate the multiple text fields with a linefeed.

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

5 REPLIES 5
ImkeF
Super User
Super User

Hi @appeljr ,

could you please paste a table or linke to a file of your sample data instead of a picture? 

Thanks!

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

appeljr
Frequent Visitor

Edited the first post; sorry about that.

Hi @appeljr 

please create a new query and paste this code into its advanced editor (replacing all existing code there):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVbLbttGFP2VW60SgGIk2YkfXRQ0JUcO/BBEJU5hezEiR+I05IwwQ9pR0E3+pNuiu6JfkG77Ff2Snju0Dcl1kNgIWhCGxeG9577PnbOz1qdfmofGMjU2I12XU2lp9+acn1bgxajbughWNAbJJKZYVaJSRtOxCVfkT7vdF9078lFd5cZSmxbSOqNFsSLfr9Nc/vUxoFciNdOAulsb2+3u1s52uI4xUVUhV3277wHeJJcUldKqVGi6VFoGJMhJWxod0MJKAXMZKU1xbpWr2nFe2zQPaJSrQmSyWORKBDSVM2MlVQAbGm1qK6aFpNjoCoi6EgX/nlvpHByviyX1OlXOzm89D2lPKj33uplYkkV6y1LqDGanSz4uCeDwai61tICaCVfh3Jp6npu68pq1VhUUBnpeKJdTagqjlXRkZjfRwY5HQ/0upQV+k0HihP75W0BHYXQnhaN6ymDSPltKYf+VudUMhLRLI6t01Tj9SpSwPazLRW7l0seslbEwEvTjODh8+/ZNSGcc/cW6yZfXIWpT3VM8LteVIYdGakKbCwRkOXt7XiUL6UCT4SLudfHXI4GqODXX8CvmAy8gIAu9+4T2Aooh9BkzjmIoZ9S/scbdM1MW9ciFo/OWy1Hv85Yv2Hkrky5FovGOtl+E1NsJqIAAqh94HLSctFZUdekBr8U2NlE/iz6omr5QjqWMbcw59AdUG3srFm4MMK42a7h36vpaA6GypihktpZp5DdBGqCHTt2l6O+Pf1DUjvlf/9Pv9KTfo2kh9LunIWbzzWB8sP8j9aNJRCfHlEyiySChk316GU2G+Hb8MkE246AfUjIY0KB/enBMpyeH+xSNJwfx4YDwvpdENIpGg3ESEFrixQYGC2G82AzodHgQD1n1KKHJCe15+cF4fDJej+aGV55ZOeO+TqW7jmVwKbSj7man2/talURMMeq9bmdza10lqac/oSBfIJPXzRQmTee02zQEZRi75J8YO1PUbLaZe5DW9sadykQZT73G9HzWxOpQ8ZQx2CYYcLvbYb7iEbQPR11zPLyXuP4L0K9YLb0HrJbNrZ3O897/t1siphpP4JXBiiBXY57RX8zCaS4WsmB24bd9U+us4HEfGrdAOEXAZJDgVDSszQuDpY/Ekh3bBnNd5RIGGMmqanm9tOhKFQUbu91d01vaD+kIJOIJ/9eAk43eZMy11eYJJKlCGkm00nnd6fS23Pq+e9ieOFtV/QGLwoU6bGJ4KPsflL7DQY+apxE0ldHMmpJ826OYJC9VxgMdPhC5r7ByHXoIvLexQe/BAZSWD0Vhfq7k+4pA2agz9qyTPp9w2CHFfptUV5iBZXuGW4LfBji0c6HVB3/PQKc49Z6Xj/4g3Pe+dM2GYZwCq38XvB+jh2sXnrcCL9DsBP6AovphuJS3H7FAbOa1Z2pW5SyV4Hrgv/OptwBnmk/9Wlb49IjAGy/9fuMoOQ1owcwxalSUap6jSQ8x399RptzCIDXcoVQqXmB3Uh0buzDYYDw0a8zLFw+ptVsWYHclbueFnqzfyEYifPoYCo/8wHryHgiLm9qVse8cD3B3u9MJHwMZNwPadmluTOGh10Lg9xXfv61pnmzF94ysThuSRHFSmMuw+L6lqeRbRvcV2+Vn8cg+aLA/y+HcA1/oVjh78Q8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1.1 = _t, Column1.2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1.1", type text}, {"Column1.2", type text}}),
    TrimText = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    AddKey = Table.AddColumn(TrimText, "Key", each if [Column1.1] = "Record number :" then [Column1.2] else null),
    FillDownKey = Table.FillDown(AddKey,{"Key"}),
    PivotWithTextCombination = Table.Pivot(FillDownKey, List.Distinct(FillDownKey[Column1.1]), "Column1.1", "Column1.2", each Text.Combine(_, "#(lf)"))
in
    PivotWithTextCombination

A key-column with be created on the record number and the pivot-formula is tweaked in the 5th parameter to concatenate the multiple text fields with a linefeed.

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

appeljr
Frequent Visitor

Dear @ImkeF ,

That worked perfectly! 

 

But after many failures, I have to ask: is there a way to extract the date from the Publisher/year column into a new column, if there are variations thereof that include brackets, and where one can't assume that you can use last non-number delimiter? Sometimes the year has brackets, othertimes it doesn't, etc. Would something like last seven characters exluding a..z but allowing brackets and periods be a solution?

 

Consider these variations:

92v. ; 8⁰.Duché, Jacob, 1738-1798.Discourses on various subjects, by Jacob Duché, M. A. Rector of Christ-Church and St. Peter's, in Philadelphia; and Formerly of Clare-Hall, Cambridge. ...Dublin: printed by B. Dugdale, No. 150, Capel Street, M.DCC.LXXXVII. [1787]-88. Sermons, American.ESTChttp://estc.bl.uk/T107819 The second volume is dated 1788.  Church of England -- United States -- Sermons.
United Churches of Christ-Church and St. Peter's (Philadelphia, Pa.)
Church of England -- Pennsylvania -- Philadelphia -- Sermons.
      
102v.,plates ; 8⁰.Duché, Jacob, 1738-1798.Discourses on various subjects, by Jacob Duché, M.A. Formerly Rector of Christ-Church and St. Peter's, in Philadelphia; And Late Chaplain to the Asylum for Female Orphans, in the Parish of Lambeth, Surry. The third edition. To which are added two discourses preached at the chapel of the asylum, now first published. ...London: printed for T. Cadell, In The Strand, M.DCC.XC, 1790. Sermons, American.ESTChttp://estc.bl.uk/T127329    Church of England -- Pennsylvania -- Philadelphia -- Sermons.
Episcopal Church -- Sermons.
      
11[4],iv,25,[1]p. ; 8⁰.Duché, Jacob, 1738-1798.The duty of standing fast in our spiritual and temporal liberties, a sermon, preached in Christ-Church, July 7th, 1775. Before the First Battalion of the city and liberties of Philadelphia; and now published at their request. By the Reverend Jacob Duché, M.A.Philadelphia: Printed and sold by James Humphreys, Junior, the corner of Black-Horse Alley, Front-Street, M,DCC,LXXV. [1775] United States -- History -- Revolution, 1775-1783.ESTChttp://estc.bl.uk/W38276Evans 14013"Prayer before sermon."--p. [i]-iv.
Dedicated to George Washington.
Signatures: pi⁴ A-C⁴ D1 (D1 verso blank)Washington, George, 1732-1799, dedicatee.
Philadelphia (Pa.). First Battalion.
Philadelphia (Pa.). First Battalion.
Humphreys, James, 1748-1810, printer.
       
12[4],iii,[1],23,[1]p. ; 8⁰.Duché, Jacob, 1738-1798.The duty of standing fast in our spiritual and temporal liberties, a sermon, Preached in Christ-Church, July 7th, 1775. Before the First Battalion of the City and Liberties of Philadelphia; And published at their Request. By the Reverend Jacob Duche, M.A.[London]: Philadelphia printed: London re-printed, and sold by T. Evans No. 54. Pater-Noster Row, MDCCLXXV. [1775] United States -- History -- Revolution, 1775-1783.
Sermons, American -- 18th century.
ESTChttp://estc.bl.uk/T35104Adams, 75-42bPrice from imprint: price Sixpence.         

Hi @appeljr 

as this is a very different topic, please post a new thread for it and " @ImkeF " -me in, so I can anser it.

 

Please also mark my answer as answer to this thread, so it is closed.

Thx!

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors