top | item 45150752

(no title)

mccanne | 5 months ago

For sure... the use of JSON here seems orthogonal to Jamie's point of constructing complex nested values in SQL with a single backend query. Modern SQLs support all this as native types (presuming the result can fit in a homogeneous relational type, which it can in this case), e.g., Jamie's query can be written with a record expression (duckdb dialect):

  with theTitle as (
    from title.parquet
    where tconst = 'tt3890160'
  ),
  principals as (
    select array_agg({id:principal.nconst,name:primaryName,category:category})
    from principal.parquet, person.parquet
    where principal.tconst = (from theTitle select tconst)
    and person.nconst = principal.nconst
  ),
  characters as (
    select array_agg(c.character) as characters, p.u.name
    from principal_character.parquet c
    join (select unnest((from principals)) as u) p
      on c.character is not null and u.id=c.nconst and c.tconst=(select tconst from theTitle)
    group by p.u
  )
  select {
    title: (select primaryTitle from theTitle),
    director: list_transform(
                list_filter((from principals), lambda elem: elem.category='director'),
                lambda elem: elem.name),
    writer: list_transform(
              list_filter((from principals), lambda elem: elem.category='writer'),
                lambda elem: elem.name),
    genres: (select genres from theTitle),
    characters: (select array_agg({name:name,characters:characters}) from characters),
  } as result
And if you query typeof on the result, you'll get:

  STRUCT(
    title VARCHAR,
    director VARCHAR[],
    writer VARCHAR[],
    genres VARCHAR,
    characters STRUCT(
      "name" VARCHAR,
      characters VARCHAR[]
    )[]
  )

discuss

order

No comments yet.