cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
appeljr Frequent Visitor
Frequent Visitor

Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

I'm a historian, not a datanaut, but I'm trying! My datasource (British Library's Engish Short-Title Catalogue) generates a two-column (character deliminated) running record of the exported publications. It makes clear sense to the human brain, since it looks like a card catalogue entry, and we can easily tell when a new publication card begins, how details are linked, etc. It would make clear sense in Access or any other database, but the project I'm working on is already invested in Excel.

 

Column1.1Column1.2
     Record number :         1
     ESTC Citation No.    W11611
     Author - personal    Duché, Jacob, 1738-1798.
     Title                The American vine, a sermon, preached in Christ-Church, Philadelphia, before the Honourable Continental Congress, July 20th, 1775. Being the day recommended by them for a general fast throughout the united English colonies of America. By the Reverend Jacob Duchë, M.A.
     Publisher/year       Philadelphia. : Printed by James Humphreys, Junior, M,DCC,LXXV. [1775]
     General note         Two states of gathering B noted. In one, B1, B2 are signed C, B2. In another, B1, B2 are signed B, C2. Two states of gatherings C and D noted. The first has "shine" for "descend" on p. 29, line 20, and an erratum note on p. 34 correcting this error. The second has "descend" o p. 29, and no erratum note.
     Uncontrolled note    Signatures: A⁴ A-C⁴ D² (D2 blank). - VERIFY DATA ON STATES OF GATHERINGS B,C,D. SEE EDWIN WOLF ARTICLE IN BSA PAPERS, V. 63, P. 264, WHICH SEEMS TO BE IN ERROR
     Citation/references  Evans 14012
     Citation/references  Sabin 21047
     Subject              United States -- History -- Revolution, 1775-1783.
     Added name           Humphreys, James, 1748-1810, printer.
     Added name           United States. Continental Congress.
     Added name           United States. Continental Congress.
     Record number :         2
     ESTC Citation No.    W479052
     Author - personal    Duché, Jacob, 1738-1798.
     Title                An anthem to be sung in the chapel of the Foundling Hospital, on Sunday the 20th of May, 1781, when a charity sermon will be preached by the Rev. Mr. Duché, Rector of Christ-Church and St. Peter's, Philadelphia.
     Publisher/year       [Philadelphia? : s.n., 1781]
     General note         Imprint conjectured from internal evidence.
     General note         Dimensions: 33 x 21 cm.
     General note         The text is in verse and consists of twenty-four lines organized into six stanzas; the first and last: "Chorus.", the second: "Recitative.", the third and fifth: "Solo.", and the fourth: "Duet."
     General note         The first line of text reads: "Almighty Lord! dispose each mind".
     Corporate subject    Pennsylvania Hospital (Philadelphia, Pa.)
     Subject              Anthems -- Early works to 1800.
     Subject              Charity-schools -- Pennsylvania -- Philadelphia -- Early works to 1800.
     Subject              Christian education of children -- Early works to 1800.
     Subject              Schools -- Pennsylvania -- Philadelphia -- Early works to 1800.
     Added name           |aPennsylvania Hospital (Philadelphia, Pa.)
     Added Title          Almighty Lord! dispose each mind.

So I can plug this into an already existing project with a layout where each publication takes up one row, I need some function to mark or label the different publication attributes so I can transpose:

Duché, Jacob, 1738-1798.AuthorPennsylvania: a poem. By a student of the College of Philadelphia.Philadelphia : Printed by B. Franklin, and D. Hall, MDCCLVI.[1756]Pennsylvania -- Poetry.ESTChttp://estc.bl.uk/W23024Evans, 7648

 

I've dug deep enough to be sure there is some solution, but every method I've seen either needs equal number of rows (these are unpredictable) or results in information being lost.

 

The repetitions in column 1.1 could be concatenated into a single entry (line break would be the best delim if possible, but otherwise bar), but I can't throw any details out.

 

If there is a way to "tag" each line with the ESTC number (which is unique to that publication), that would work, but every tutorial I've tried in the last two days to create an index, group, and then expand, doesn't get the job done.

 

Any help or directions would be most welcome!

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Hi @appeljr 

please create a new query and paste this code into its advanced editor (replacing all existing code there):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVbLbttGFP2VW60SgGIk2YkfXRQ0JUcO/BBEJU5hezEiR+I05IwwQ9pR0E3+pNuiu6JfkG77Ff2Snju0Dcl1kNgIWhCGxeG9577PnbOz1qdfmofGMjU2I12XU2lp9+acn1bgxajbughWNAbJJKZYVaJSRtOxCVfkT7vdF9078lFd5cZSmxbSOqNFsSLfr9Nc/vUxoFciNdOAulsb2+3u1s52uI4xUVUhV3277wHeJJcUldKqVGi6VFoGJMhJWxod0MJKAXMZKU1xbpWr2nFe2zQPaJSrQmSyWORKBDSVM2MlVQAbGm1qK6aFpNjoCoi6EgX/nlvpHByviyX1OlXOzm89D2lPKj33uplYkkV6y1LqDGanSz4uCeDwai61tICaCVfh3Jp6npu68pq1VhUUBnpeKJdTagqjlXRkZjfRwY5HQ/0upQV+k0HihP75W0BHYXQnhaN6ymDSPltKYf+VudUMhLRLI6t01Tj9SpSwPazLRW7l0seslbEwEvTjODh8+/ZNSGcc/cW6yZfXIWpT3VM8LteVIYdGakKbCwRkOXt7XiUL6UCT4SLudfHXI4GqODXX8CvmAy8gIAu9+4T2Aooh9BkzjmIoZ9S/scbdM1MW9ciFo/OWy1Hv85Yv2Hkrky5FovGOtl+E1NsJqIAAqh94HLSctFZUdekBr8U2NlE/iz6omr5QjqWMbcw59AdUG3srFm4MMK42a7h36vpaA6GypihktpZp5DdBGqCHTt2l6O+Pf1DUjvlf/9Pv9KTfo2kh9LunIWbzzWB8sP8j9aNJRCfHlEyiySChk316GU2G+Hb8MkE246AfUjIY0KB/enBMpyeH+xSNJwfx4YDwvpdENIpGg3ESEFrixQYGC2G82AzodHgQD1n1KKHJCe15+cF4fDJej+aGV55ZOeO+TqW7jmVwKbSj7man2/talURMMeq9bmdza10lqac/oSBfIJPXzRQmTee02zQEZRi75J8YO1PUbLaZe5DW9sadykQZT73G9HzWxOpQ8ZQx2CYYcLvbYb7iEbQPR11zPLyXuP4L0K9YLb0HrJbNrZ3O897/t1siphpP4JXBiiBXY57RX8zCaS4WsmB24bd9U+us4HEfGrdAOEXAZJDgVDSszQuDpY/Ekh3bBnNd5RIGGMmqanm9tOhKFQUbu91d01vaD+kIJOIJ/9eAk43eZMy11eYJJKlCGkm00nnd6fS23Pq+e9ieOFtV/QGLwoU6bGJ4KPsflL7DQY+apxE0ldHMmpJ826OYJC9VxgMdPhC5r7ByHXoIvLexQe/BAZSWD0Vhfq7k+4pA2agz9qyTPp9w2CHFfptUV5iBZXuGW4LfBji0c6HVB3/PQKc49Z6Xj/4g3Pe+dM2GYZwCq38XvB+jh2sXnrcCL9DsBP6AovphuJS3H7FAbOa1Z2pW5SyV4Hrgv/OptwBnmk/9Wlb49IjAGy/9fuMoOQ1owcwxalSUap6jSQ8x399RptzCIDXcoVQqXmB3Uh0buzDYYDw0a8zLFw+ptVsWYHclbueFnqzfyEYifPoYCo/8wHryHgiLm9qVse8cD3B3u9MJHwMZNwPadmluTOGh10Lg9xXfv61pnmzF94ysThuSRHFSmMuw+L6lqeRbRvcV2+Vn8cg+aLA/y+HcA1/oVjh78Q8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1.1 = _t, Column1.2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1.1", type text}, {"Column1.2", type text}}),
    TrimText = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    AddKey = Table.AddColumn(TrimText, "Key", each if [Column1.1] = "Record number :" then [Column1.2] else null),
    FillDownKey = Table.FillDown(AddKey,{"Key"}),
    PivotWithTextCombination = Table.Pivot(FillDownKey, List.Distinct(FillDownKey[Column1.1]), "Column1.1", "Column1.2", each Text.Combine(_, "#(lf)"))
in
    PivotWithTextCombination

A key-column with be created on the record number and the pivot-formula is tweaked in the 5th parameter to concatenate the multiple text fields with a linefeed.

 

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

5 REPLIES 5
Super User
Super User

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Hi @appeljr ,

could you please paste a table or linke to a file of your sample data instead of a picture? 

Thanks!

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Highlighted
appeljr Frequent Visitor
Frequent Visitor

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Edited the first post; sorry about that.

Super User
Super User

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Hi @appeljr 

please create a new query and paste this code into its advanced editor (replacing all existing code there):

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("xVbLbttGFP2VW60SgGIk2YkfXRQ0JUcO/BBEJU5hezEiR+I05IwwQ9pR0E3+pNuiu6JfkG77Ff2Snju0Dcl1kNgIWhCGxeG9577PnbOz1qdfmofGMjU2I12XU2lp9+acn1bgxajbughWNAbJJKZYVaJSRtOxCVfkT7vdF9078lFd5cZSmxbSOqNFsSLfr9Nc/vUxoFciNdOAulsb2+3u1s52uI4xUVUhV3277wHeJJcUldKqVGi6VFoGJMhJWxod0MJKAXMZKU1xbpWr2nFe2zQPaJSrQmSyWORKBDSVM2MlVQAbGm1qK6aFpNjoCoi6EgX/nlvpHByviyX1OlXOzm89D2lPKj33uplYkkV6y1LqDGanSz4uCeDwai61tICaCVfh3Jp6npu68pq1VhUUBnpeKJdTagqjlXRkZjfRwY5HQ/0upQV+k0HihP75W0BHYXQnhaN6ymDSPltKYf+VudUMhLRLI6t01Tj9SpSwPazLRW7l0seslbEwEvTjODh8+/ZNSGcc/cW6yZfXIWpT3VM8LteVIYdGakKbCwRkOXt7XiUL6UCT4SLudfHXI4GqODXX8CvmAy8gIAu9+4T2Aooh9BkzjmIoZ9S/scbdM1MW9ciFo/OWy1Hv85Yv2Hkrky5FovGOtl+E1NsJqIAAqh94HLSctFZUdekBr8U2NlE/iz6omr5QjqWMbcw59AdUG3srFm4MMK42a7h36vpaA6GypihktpZp5DdBGqCHTt2l6O+Pf1DUjvlf/9Pv9KTfo2kh9LunIWbzzWB8sP8j9aNJRCfHlEyiySChk316GU2G+Hb8MkE246AfUjIY0KB/enBMpyeH+xSNJwfx4YDwvpdENIpGg3ESEFrixQYGC2G82AzodHgQD1n1KKHJCe15+cF4fDJej+aGV55ZOeO+TqW7jmVwKbSj7man2/talURMMeq9bmdza10lqac/oSBfIJPXzRQmTee02zQEZRi75J8YO1PUbLaZe5DW9sadykQZT73G9HzWxOpQ8ZQx2CYYcLvbYb7iEbQPR11zPLyXuP4L0K9YLb0HrJbNrZ3O897/t1siphpP4JXBiiBXY57RX8zCaS4WsmB24bd9U+us4HEfGrdAOEXAZJDgVDSszQuDpY/Ekh3bBnNd5RIGGMmqanm9tOhKFQUbu91d01vaD+kIJOIJ/9eAk43eZMy11eYJJKlCGkm00nnd6fS23Pq+e9ieOFtV/QGLwoU6bGJ4KPsflL7DQY+apxE0ldHMmpJ826OYJC9VxgMdPhC5r7ByHXoIvLexQe/BAZSWD0Vhfq7k+4pA2agz9qyTPp9w2CHFfptUV5iBZXuGW4LfBji0c6HVB3/PQKc49Z6Xj/4g3Pe+dM2GYZwCq38XvB+jh2sXnrcCL9DsBP6AovphuJS3H7FAbOa1Z2pW5SyV4Hrgv/OptwBnmk/9Wlb49IjAGy/9fuMoOQ1owcwxalSUap6jSQ8x399RptzCIDXcoVQqXmB3Uh0buzDYYDw0a8zLFw+ptVsWYHclbueFnqzfyEYifPoYCo/8wHryHgiLm9qVse8cD3B3u9MJHwMZNwPadmluTOGh10Lg9xXfv61pnmzF94ysThuSRHFSmMuw+L6lqeRbRvcV2+Vn8cg+aLA/y+HcA1/oVjh78Q8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1.1 = _t, Column1.2 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1.1", type text}, {"Column1.2", type text}}),
    TrimText = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    AddKey = Table.AddColumn(TrimText, "Key", each if [Column1.1] = "Record number :" then [Column1.2] else null),
    FillDownKey = Table.FillDown(AddKey,{"Key"}),
    PivotWithTextCombination = Table.Pivot(FillDownKey, List.Distinct(FillDownKey[Column1.1]), "Column1.1", "Column1.2", each Text.Combine(_, "#(lf)"))
in
    PivotWithTextCombination

A key-column with be created on the record number and the pivot-formula is tweaked in the 5th parameter to concatenate the multiple text fields with a linefeed.

 

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




View solution in original post

appeljr Frequent Visitor
Frequent Visitor

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Dear @ImkeF ,

That worked perfectly! 

 

But after many failures, I have to ask: is there a way to extract the date from the Publisher/year column into a new column, if there are variations thereof that include brackets, and where one can't assume that you can use last non-number delimiter? Sometimes the year has brackets, othertimes it doesn't, etc. Would something like last seven characters exluding a..z but allowing brackets and periods be a solution?

 

Consider these variations:

92v. ; 8⁰.Duché, Jacob, 1738-1798.Discourses on various subjects, by Jacob Duché, M. A. Rector of Christ-Church and St. Peter's, in Philadelphia; and Formerly of Clare-Hall, Cambridge. ...Dublin: printed by B. Dugdale, No. 150, Capel Street, M.DCC.LXXXVII. [1787]-88. Sermons, American.ESTChttp://estc.bl.uk/T107819 The second volume is dated 1788.  Church of England -- United States -- Sermons.
United Churches of Christ-Church and St. Peter's (Philadelphia, Pa.)
Church of England -- Pennsylvania -- Philadelphia -- Sermons.
      
102v.,plates ; 8⁰.Duché, Jacob, 1738-1798.Discourses on various subjects, by Jacob Duché, M.A. Formerly Rector of Christ-Church and St. Peter's, in Philadelphia; And Late Chaplain to the Asylum for Female Orphans, in the Parish of Lambeth, Surry. The third edition. To which are added two discourses preached at the chapel of the asylum, now first published. ...London: printed for T. Cadell, In The Strand, M.DCC.XC, 1790. Sermons, American.ESTChttp://estc.bl.uk/T127329    Church of England -- Pennsylvania -- Philadelphia -- Sermons.
Episcopal Church -- Sermons.
      
11[4],iv,25,[1]p. ; 8⁰.Duché, Jacob, 1738-1798.The duty of standing fast in our spiritual and temporal liberties, a sermon, preached in Christ-Church, July 7th, 1775. Before the First Battalion of the city and liberties of Philadelphia; and now published at their request. By the Reverend Jacob Duché, M.A.Philadelphia: Printed and sold by James Humphreys, Junior, the corner of Black-Horse Alley, Front-Street, M,DCC,LXXV. [1775] United States -- History -- Revolution, 1775-1783.ESTChttp://estc.bl.uk/W38276Evans 14013"Prayer before sermon."--p. [i]-iv.
Dedicated to George Washington.
Signatures: pi⁴ A-C⁴ D1 (D1 verso blank)Washington, George, 1732-1799, dedicatee.
Philadelphia (Pa.). First Battalion.
Philadelphia (Pa.). First Battalion.
Humphreys, James, 1748-1810, printer.
       
12[4],iii,[1],23,[1]p. ; 8⁰.Duché, Jacob, 1738-1798.The duty of standing fast in our spiritual and temporal liberties, a sermon, Preached in Christ-Church, July 7th, 1775. Before the First Battalion of the City and Liberties of Philadelphia; And published at their Request. By the Reverend Jacob Duche, M.A.[London]: Philadelphia printed: London re-printed, and sold by T. Evans No. 54. Pater-Noster Row, MDCCLXXV. [1775] United States -- History -- Revolution, 1775-1783.
Sermons, American -- 18th century.
ESTChttp://estc.bl.uk/T35104Adams, 75-42bPrice from imprint: price Sixpence.         
Super User
Super User

Re: Grouping multiple (uneven numbers of) rows of details about a publication as part of one pub

Hi @appeljr 

as this is a very different topic, please post a new thread for it and " @ImkeF " -me in, so I can anser it.

 

Please also mark my answer as answer to this thread, so it is closed.

Thx!

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Helpful resources

Announcements
Can You Solve These Challenge

Challenge: Can You Solve These?

Find out how to participate in the first Power BI 'Can You Solve These?' challenge.

Community News & Announcements

Community News & Announcements

Get your latest community news and announcements.

Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Community Kudopalooza

Win Power BI Swag with Community Kudopalooza!

Each week, complete activities and be qualified in the drawing for cool Power BI Swag.

Users Online
Currently online: 162 members 1,950 guests
Please welcome our newest community members: