dlkmp | 11 months ago
Translating the examples from the ReadMe, having read the file with:
$medias = Get-Content .\medias.csv | ConvertFrom-Csv
Previewing the file in the terminal:
xan view medias.csv
$medias | Format-Table
Reading a flattened representation of the first row:
xan flatten -c medias.csv
$medias | Format-List
Searching for rows:
xan search -s outreach internationale medias.csv | xan view
$medias | Where-Object { $_.outreach -eq "internationale" } | Format-Table
Selecting some columns:
xan select foundation_year,name medias.csv | xan view
$medias | Select-Object -Property foundation_year, name | Format-Table
Sorting the file:
xan sort -s foundation_year medias.csv | xan view -s name,foundation_year
$medias | Sort-Object -Property foundation_year | Select-Object -Property name, foundation_year | Format-Table
Deduplicating the file on some column:
# Some medias of our corpus have the same ids on mediacloud.org
xan dedup -s mediacloud_ids medias.csv | xan count && xan count medias.csv
$medias | Select-Object -ExpandProperty mediacloud_ids -Unique | Measure-Object; $medias | Measure-Object -Property mediacloud_ids
Computing frequency tables:
xan frequency -s edito medias.csv | xan view
$medias | Group-Object -Property edito | Sort-Object -Property Count -Descending
It's probably orders of magnitude slower, and of course, plotting graphs and so on gets tricky. But for the simple type of analysis I typically do, it's fast enough, I don't need to learn an extra tool, and the auto-completion of column/property names is very convenient.
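For anyone who wants to try the sorting, deduplication and frequency steps without the original file, here is a minimal sketch on inline sample data. The rows are made up for illustration and are not the real medias.csv. Two things it assumes or changes: every CSV field arrives as a string, so the sort casts foundation_year to [int] to avoid lexical ordering, and the dedup uses Group-Object to keep one whole row per mediacloud_ids value, which is one possible approach rather than the only one.

```powershell
# Hypothetical sample data standing in for medias.csv (not the real corpus).
$csv = @'
name,foundation_year,outreach,edito,mediacloud_ids
Alpha,1990,internationale,presse,101
Beta,1985,nationale,presse,102
Gamma,2001,internationale,tv,101
Delta,1975,regionale,radio,103
'@

$medias = $csv | ConvertFrom-Csv

# CSV fields are strings, so cast to [int] to sort numerically.
$sorted = $medias | Sort-Object -Property { [int]$_.foundation_year }

# Keep the first whole row for each mediacloud_ids value.
$deduped = $medias |
    Group-Object -Property mediacloud_ids |
    ForEach-Object { $_.Group[0] }

# Frequency table of the edito column, most common value first.
$freq = $medias |
    Group-Object -Property edito -NoElement |
    Sort-Object -Property Count -Descending

$sorted | Select-Object -Property name, foundation_year | Format-Table
"{0} rows after dedup, {1} before" -f @($deduped).Count, @($medias).Count
$freq | Format-Table -Property Name, Count
```

Wrapping the results in @(...) before reading .Count keeps the count correct even when a pipeline returns a single object.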
account-5 | 11 months ago
I'm currently on my phone so can't go through all the examples, but knowing both PS and nu, nu has the better syntax.
EDIT:
Get data and view in table:
Get headers:
Get count of rows:
Get flattened, slightly more convoluted (caveat: there might be a better way):
Search rows:
Select columns:
Sort file:
Dedup based on column:
Computing frequency and histogram:
SwamyM | 11 months ago