You’re probably right about BBEdit versus other specialized tools - even Rich 
suggested this at the beginning of our conversation.

The question is: Which other tool exactly?

Excel would not import 3 million+ records 😔 and yes, I ran into the 
“helpful” changes and a few more besides 😶

The Perl-CSV family of modules sounds like heaven 😉 - unfortunately I haven’t 
learned Perl yet 😶
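
For readers without Perl, here is a minimal sketch of the same idea in Python, whose standard library includes a csv module. It is not something proposed in this thread. It assumes a well-formed, semicolon-separated file with the header row shown in the sample further down, and that the matching rows fit in memory. The function name and the file names in the comments are hypothetical.

```python
import csv
import io

# Sort order requested in the thread; the names come from the sample header row.
SORT_KEYS = ["MSGNO", "ADRC_COUNTRY", "ADRC_REGION", "ADRC_POST_CODE1",
             "ADRC_CITY1", "ADRC_CITY2", "ADRC_STREET", "ADRC_HOUSE_NUM1"]


def filter_and_sort(src, dst, flag_column="ADR_CHK_KZ", flag_value="1"):
    """Copy rows whose flag_column equals flag_value from src to dst,
    sorted by SORT_KEYS. Returns the number of rows written (header excluded)."""
    reader = csv.reader(src, delimiter=";")
    header = next(reader)
    flag = header.index(flag_column)
    keys = [header.index(name) for name in SORT_KEYS]
    # Rows with an unexpected field count are skipped here, not repaired.
    rows = [row for row in reader
            if len(row) == len(header) and row[flag] == flag_value]
    # Plain string comparison, field by field, in the order given by SORT_KEYS.
    rows.sort(key=lambda row: [row[i] for i in keys])
    writer = csv.writer(dst, delimiter=";", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return len(rows)


# Hypothetical usage on real files (paths and encoding are assumptions):
# with open("export.csv", newline="", encoding="utf-8") as src, \
#      open("filtered.csv", "w", newline="", encoding="utf-8") as dst:
#     filter_and_sort(src, dst)
```

All fields stay plain strings, so leading zeroes in values such as postal codes are written back unchanged.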


> On 26.03.2025 at 15:13, Bruce Van Allen <[email protected]> wrote:
> 
> Properly formatted .csv files may contain CR, LF, and CRLF line endings 
> inside records (within quoted fields). This is where the best approach is 
> to use tools designed for the .csv format, rather than BBEdit’s manifold 
> text editor tools.
> 
> In Perl, we use the Text::CSV family of modules. You can specify the column 
> separator as the semi-colon in your files, and read in the records as simple 
> arrays, with null or blank values handled. There’s also the database route if 
> you wanted to go a bit further with your Perl - DBI plus DBD::CSV.
> 
> Other modern languages have similar capabilities.
> 
> You could probably also open that file in Excel, if you want the GUI path. 
> Sometimes I do that just to take a look at a file’s structure. But beware 
> that Excel will convert some data such as dates to its own format, and make 
> other “helpful” changes like dropping leading zeroes - a problem with postal 
> codes, e.g. - unless you first designate all cells in the sheet as “text”.
> 
> HTH
> 
> ------ Original Message ------
> From "Vlad Ghitulescu" <[email protected]>
> To [email protected]
> Date 3/26/2025 6:47:17 AM
> Subject Re: How to filter / sort CSV-files by certain columns?
> 
>> The “disaster” was most probably caused by errors in the CSV file: there is 
>> a stray " in the line just after the point where things go wrong, and 3 
>> CRLFs in the middle of some rows 😣
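
A quick way to locate rows like these is to count the separators on each physical line and flag every line that differs from the header or contains a double quote. The following is a minimal Python sketch, not something used in this thread; the function name is hypothetical, and it assumes that fields are never quoted, so that every semicolon is a separator.

```python
def field_count_report(lines, delimiter=";"):
    """Return (expected, suspects): the header's field count and a list of
    (line_number, field_count, has_quote) for every later physical line that
    has a different field count or contains a double quote."""
    expected = None
    suspects = []
    for number, line in enumerate(lines, start=1):
        count = line.rstrip("\r\n").count(delimiter) + 1
        has_quote = '"' in line
        if expected is None:
            # The first line is taken to be the header and sets the baseline.
            expected = count
            continue
        if count != expected or has_quote:
            suspects.append((number, count, has_quote))
    return expected, suspects


# Hypothetical usage (path and encoding are assumptions):
# with open("export.csv", newline="", encoding="utf-8") as handle:
#     expected, suspects = field_count_report(handle)
```

A record broken by a stray line ending shows up as two or more consecutive suspect lines whose field counts are each too small.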
>> 
>> 
>> 
>>> On 26.03.2025 at 09:29, Vlad Ghitulescu <[email protected]> wrote:
>>> 
>>> Rearranging the columns worked only on the first 25,816 lines
>>> 
>>> --
>>> This is the BBEdit Talk public discussion group. If you have a feature 
>>> request or believe that the application isn't working correctly, please 
>>> email "[email protected]" rather than posting here. Follow @bbedit on 
>>> Mastodon: <https://mastodon.social/@bbedit>
>>> ---
>>> You received this message because you are subscribed to the Google Groups 
>>> "BBEdit Talk" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To view this discussion visit 
>>> https://groups.google.com/d/msgid/bbedit/D78F9924-03B3-4117-9B3D-BC6614B5D1D5%40Ghitulescu.de.
>>> <CleanShot 2025-03-26 at 09.21.19.png>
>>> 
>>> The rest of the file was more or less mixed up (see above). I’ll talk to 
>>> support about it.
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On 26.03.2025 at 09:08, Vlad Ghitulescu <[email protected]> wrote:
>>>> 
>>>> Hello again,
>>>> 
>>>> 
>>>> I changed all the semicolons to ",", then used grep to replace ^ and $ 
>>>> with ", and then voilà! the columns magically appeared:
>>>> 
>>>> <CleanShot 2025-03-26 at 08.57.09.png>
>>>> 
>>>> After this I could successfully rearrange the columns
>>>> 
>>>> <CleanShot 2025-03-26 at 09.01.19.png>
>>>> 
>>>> I then sorted all but the first line… and this totally destroyed the file
>>>> 
>>>> <CleanShot 2025-03-26 at 09.03.44.png>
>>>> 
>>>> (The copy of the file, that is - naturally! 😉)
>>>> 
>>>> I’ll start analyzing the new problem…
>>>> 
>>>> 
>>>> Regards,
>>>> Vlad
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 26.03.2025 at 08:15, Vlad Ghitulescu <[email protected]> wrote:
>>>>> 
>>>>> Hey Bruce,
>>>>> 
>>>>> 
>>>>> Thanks for the idea!
>>>>> 
>>>>> I changed the semicolons to tabs
>>>>> 
>>>>> <CleanShot 2025-03-26 at 08.11.14.png>
>>>>> 
>>>>> but unfortunately that didn’t unlock the ability to rearrange the 
>>>>> columns
>>>>> 
>>>>> <CleanShot 2025-03-26 at 08.11.50.png>
>>>>> 
>>>>> Any idea where I went wrong?
>>>>> 
>>>>> Thanks again!
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Vlad
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 25.03.2025 at 23:49, Bruce Van Allen <[email protected]> wrote:
>>>>>> 
>>>>>> Hey Vlad,
>>>>>> 
>>>>>> A BBEdit feature that might help is its ability to detect and manipulate 
>>>>>> *columns*. You’ll find it under the Edit menu. As far as I can tell, the 
>>>>>> columns it detects are tab or comma separated. A look at your sample 
>>>>>> showed no quote marks, but it did have a few commas; if you replace the 
>>>>>> semi-colon (‘;’) separators with tabs, then BBEdit will recognize your 
>>>>>> columns. How could that help? Well, one idea would be to move the 
>>>>>> column(s) you care about for search or sorting to the left-most 
>>>>>> position, greatly simplifying any pattern matching you want to try.
>>>>>> 
>>>>>> Always work on a copy of your file, of course.
>>>>>> 
>>>>>> HTH
>>>>>> 
>>>>>> ------ Original Message ------
>>>>>> From "GP" <[email protected]>
>>>>>> To "BBEdit Talk" <[email protected]>
>>>>>> Date 3/25/2025 2:32:00 PM
>>>>>> Subject Re: How to filter / sort CSV-files by certain columns?
>>>>>> 
>>>>>>> As a follow up...
>>>>>>> 
>>>>>>> BBEdit's Pattern Playground is a great help in constructing tedious 
>>>>>>> grep patterns like you'll need for your filtering and sorting needs. 
>>>>>>> The really tedious part is getting the field position(s) you want to 
>>>>>>> filter or sort on so you can modify that field's match pattern to 
>>>>>>> conform to the desired filter or sorting criteria.
>>>>>>> 
>>>>>>> For example... For your "Filter all lines that have ADR_CHK_KZ = 1" 
>>>>>>> using Text -> Process Lines Containing ... with the grep pattern:
>>>>>>> 
>>>>>>> 
>>>>>>> \d{3};\w{3};[^;]*;[^;]*;\d{10};\w{2};\d{2};\d{5};[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;(1);[^;]*;[^\n]*
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> will do the trick. For filtering you don't need the group capturing on 
>>>>>>> the 1 but it is useful with Pattern Playground to verify you're getting 
>>>>>>> the right field position and field contents matched.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> For your "Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION, 
>>>>>>> ADRC_POST_CODE1, ADRC_CITY1, ADRC_CITY2, ADRC_STREET and 
>>>>>>> ADRC_HOUSE_NUM1" using Text -> Sort Lines ... with a grep pattern of:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> \d{3};\w{3};[^;]*;[^;]*;\d{10};(\w{2});(\d{2});(\d{5});([^;]*);[^;]*;([^;]*);([^;]*);([^;]*);[^;]*;\w{2};\d{2};\d{5};[^;]*;\d{12};[^;]*;[^;]*;\d{8};[^;]*;\d{12};[^;]*;[^;]*;\d;\d;\d;\d;\d;\d;\d;\d;([^;]*);[^\n]*
>>>>>>> 
>>>>>>> with "Specific sub-patterns" selected with \8\1\2\3\4\5\6\7 in the fill 
>>>>>>> in field will sort your example text using your desired field ordering.
>>>>>>> 
>>>>>>> On Tuesday, March 25, 2025 at 12:53:47 PM UTC-7 GP wrote:
>>>>>>>> For filtering, look at Text -> Process Lines Containing ... and for 
>>>>>>>> sorting Text -> Sort Lines ... using grep patterns to identify what 
>>>>>>>> you want to match for filtering and which subpattern field or fields 
>>>>>>>> you want to sort on.
>>>>>>>> 
>>>>>>>> If the number of fields in your sample is representative of the real 
>>>>>>>> CSV files you're working with, it is going to be something of a pain 
>>>>>>>> in the rear coming up with the grep patterns needed to accomplish the 
>>>>>>>> desired filtering and sorting.
>>>>>>>> 
>>>>>>>> On Tuesday, March 25, 2025 at 11:03:35 AM UTC-7 Vlad Ghitulescu wrote:
>>>>>>>>> Hey,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I use BBEdit very often while working with big CSV-files (300 - 500 
>>>>>>>>> MB, up to 4 million rows) looking like this:
>>>>>>>>> 
>>>>>>>>> MANDT;BU;IDENTIFIER;OBJNR;ADRC_ADDRNUMBER;ADRC_COUNTRY;ADRC_REGION;ADRC_POST_CODE1;ADRC_CITY1;ADRC_CITY_EXT;ADRC_CITY2;ADRC_STREET;ADRC_HOUSE_NUM1;ADRC_HOUSE_NUM2;LOKAREF_COUNTRY;LOKAREF_REGION;LOKAREF_POST_CODE1;LOKAREF_CITY1;LOKAREF_CITY_CODE;LOKAREF_CITY_EXT;LOKAREF_CITY2;LOKAREF_CITYP_CODE;LOKAREF_STREET;LOKAREF_STRT_CODE;LOKAREF_HOUSE_NUM1;LOKAREF_HOUSE_NUM2;COUNTRY_KZ;REGION_KZ;POST_CODE1_KZ;CITY1_KZ;CITY_EXT_KZ;CITY2_KZ;STREET_KZ;ADR_CHK_KZ;MSGNO;MESSAGE
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723592;DE;09;86415;Mering;;Sankt Afra;Egerländer Straße;;;DE;09;86415;Mering;500000002795;, Schwab;Sankt Afra;00000006;Egerländerstraße;910011919800;;;0;0;0;0;1;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723657;DE;09;85655;Aying;;Kaps;Kaps;;;DE;09;85653;Aying;500000002262;;Kaps;00000010;Kaps;700055566100;;;0;0;1;0;3;0;0;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723658;DE;09;83083;Riedering;;Patting;Patting;;;DE;09;83083;Riedering;500000002552;b Rosenheim, Oberbay;Patting;00000037;Pattinger Straße;910003809300;;;0;0;0;0;1;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723674;DE;09;85655;Aying;;Großhelfendorf;Hirschbergstraße;;;DE;09;85653;Aying;500000002262;;Großhelfendorf;00000007;Hirschbergstraße;910002873200;;;0;0;1;0;3;0;0;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723878;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723908;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723918;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007723956;DE;09;93336;Altmannstein;;Berghausen;Altmannsteiner Str.;;;DE;09;93336;Altmannstein;500000005266;;Berghausen;00000003;Altmannsteiner Straße;910001339100;;;0;0;0;0;3;0;1;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724554;DE;09;95131;Schwarzenbach a.Wald;;Schwarzenbach a Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach a.Wald;500000011836;;Schwarzenbach a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
>>>>>>>>> 200;BAG;20250324080508_/ETN/PM_EAV_ADR_CHK_ADRC_V14157F;;0007724593;DE;09;95131;Schwarzenbach a.Wald;;Schwarzenbach a Wald;Walter-Münch-Straße;;;DE;09;95131;Schwarzenbach a.Wald;500000011836;;Schwarzenbach a.Wald;00000001;Walter-Münch-Straße;910007835500;;;0;0;0;0;3;1;0;1;;
>>>>>>>>> 
>>>>>>>>> Once in a while I’d like to filter or sort such huge files by one or 
>>>>>>>>> more columns, like:
>>>>>>>>> 
>>>>>>>>> 1. Filter all lines that have ADR_CHK_KZ = 1 or
>>>>>>>>> 2. Sort the file by MSGNO, ADRC_COUNTRY, ADRC_REGION, 
>>>>>>>>> ADRC_POST_CODE1, ADRC_CITY1, ADRC_CITY2, ADRC_STREET and 
>>>>>>>>> ADRC_HOUSE_NUM1.
>>>>>>>>> 
>>>>>>>>> Is there a way to do this sort of task with BBEdit?
>>>>>>>>> 
>>>>>>>>> Thanks!
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Vlad
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> - Bruce
>>>>>> 
>>>>>> _bruce__van_allen__santa_cruz__ca_
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
> 
> Thanks,
> 
>    - Bruce
> 
> _bruce__van_allen__santa_cruz__ca_
> 
> 

