kadder | 5 years ago
Have a look at some of the abstractions SageMaker has in place for a modeling workflow. Abstracting featurization at that level is more beneficial than approaching the problem via a core-engineering-driven mechanism.
Based on my experience building a system like this, the share of features that get reused or searched for across models is generally quite small. A system that provides a publishing mechanism and a fast, simple serving mechanism generally meets all the needs.
Metadata management, history, audit, lineage, and search are all good to have, but they are not critical requirements for most practitioners.
E.g., computing an aggregate lookup over a time range in Spark, archiving it in S3, wrapping it in a fast lookup implementation inside a scikit-learn transformer, and having the transformer pickle the lookup will give you offline/online parity out of the box.
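A rough sketch of the transformer part of that idea, under some assumptions: the class name `AggregateLookupTransformer`, the `default` parameter, and the sample keys are all made up for illustration, and a plain dict stands in for the aggregates that would be computed in Spark and pulled from the S3 archive (that loading step is left out here).

```python
import pickle

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class AggregateLookupTransformer(BaseEstimator, TransformerMixin):
    """Maps an entity key to a precomputed aggregate (hypothetical example)."""

    def __init__(self, lookup=None, default=0.0):
        self.lookup = lookup
        self.default = default

    def fit(self, X, y=None):
        # Nothing to learn: the aggregates were computed upstream.
        return self

    def transform(self, X):
        table = self.lookup or {}
        # One output column: the aggregate for each key, or the default if unseen.
        return np.array([[table.get(key, self.default)] for key in X], dtype=float)


# Stand-in for aggregates computed in Spark and loaded from an S3 archive.
aggregates = {"user_1": 12.5, "user_2": 3.0}

# Offline: build training features with the transformer.
offline = AggregateLookupTransformer(lookup=aggregates, default=0.0)
train_features = offline.transform(["user_1", "user_2", "user_3"])

# Pickling the transformer carries the lookup table along with it.
blob = pickle.dumps(offline)

# Online: the unpickled transformer serves the same values from memory.
online = pickle.loads(blob)
serve_features = online.transform(["user_1", "user_2", "user_3"])
```

Because the lookup travels inside the pickled object, the serving side reads exactly the values used in training, which is where the parity comes from. For very large lookups an in-memory dict may not be practical, so treat this as an illustration rather than a sizing recommendation.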