PackageBuilder is a system that extends query engines to support package generation. A package is a collection of tuples that satisfies global properties defined over the collection as a whole.
- Meal Planner:
An athlete needs to put together a dietary plan in preparation for a race. She wants a high-protein set of three meals for the day that totals between 2000 and 2500 calories. All meals should be gluten-free. It is easy to exclude meals that include gluten, as this condition can be checked for each meal (tuple) individually with a regular selection predicate. The other constraints (the number of meals and the total calories) need to be verified collectively over the entire package.
- Investment Portfolio:
A broker wants to construct an investment portfolio for one of her clients. The client has a budget of $50K, wants to invest at least 30% of the assets in technology, and wants a balance of short-term and long-term options. The broker cannot select each stock option individually, but rather needs to find a stock package that satisfies all these constraints collectively.
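As a rough illustration of why these constraints cannot be checked one stock at a time, here is a minimal Python sketch of a check over a whole candidate portfolio. The `Stock` fields, the sample data, and the function name are hypothetical. The sketch also assumes that "30% of the assets" means 30% of the invested amount, and that "a balance" means at least one short-term and one long-term option. Neither reading is specified above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    ticker: str   # hypothetical identifier
    sector: str   # e.g. "technology"
    price: float  # cost of buying this option, in dollars
    term: str     # "short" or "long"


def satisfies_portfolio_constraints(package, budget=50_000.0, min_tech_share=0.30):
    """Check the global constraints over the package as a whole."""
    total = sum(s.price for s in package)
    if total == 0 or total > budget:
        return False
    tech = sum(s.price for s in package if s.sector == "technology")
    if tech / total < min_tech_share:
        return False
    # Assumption: "balance" means both terms are represented at least once.
    return {"short", "long"} <= {s.term for s in package}


# Hypothetical stock options.
stocks = [
    Stock("AAA", "technology", 12_000.0, "short"),
    Stock("BBB", "energy", 20_000.0, "long"),
    Stock("CCC", "technology", 5_000.0, "long"),
    Stock("DDD", "finance", 30_000.0, "short"),
]

ok = satisfies_portfolio_constraints([stocks[0], stocks[1], stocks[2]])
too_little_tech = satisfies_portfolio_constraints([stocks[1], stocks[2]])
over_budget = satisfies_portfolio_constraints([stocks[0], stocks[1], stocks[3]])
```

No per-stock predicate can decide membership here: whether a given stock belongs in the portfolio depends on which other stocks are chosen with it.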
Don't existing DBMSs support package queries already?
With existing query languages, users can easily express base constraints, i.e., constraints that apply to every tuple in the query result individually. Global constraints, in contrast, are properties that a set of tuples must satisfy as a whole. Unfortunately, traditional DBMSs and their query languages largely disregard this type of constraint. Supporting global constraints therefore becomes a burden at the application level rather than at the database level, where it belongs.
PaQL — The Package Query Language
A Meal Planner example:
SELECT PACKAGE(R) AS P
FROM Recipes R REPEAT 0
WHERE R.gluten-free = 'TRUE'
SUCH THAT COUNT(*) = 3 AND
          SUM(calories) BETWEEN 2000 AND 2500 AND
          (SELECT COUNT(*) FROM P WHERE carbs > 0) >= COUNT(*)/2
MAXIMIZE SUM(protein)
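To make the semantics of the query above concrete, the following Python sketch evaluates it by brute-force enumeration over a small, made-up `recipes` table. This is only an illustration of what the query asks for, not how PackageBuilder evaluates it. The sketch assumes that `REPEAT 0` means no recipe may appear more than once in a package, and that `COUNT(*)/2` is a real-valued division. All rows and helper names are hypothetical.

```python
from itertools import combinations

# Hypothetical rows of the Recipes relation.
recipes = [
    {"name": "omelette", "gluten_free": True, "calories": 600, "protein": 40, "carbs": 5},
    {"name": "chicken rice", "gluten_free": True, "calories": 900, "protein": 55, "carbs": 80},
    {"name": "pasta", "gluten_free": False, "calories": 800, "protein": 25, "carbs": 110},
    {"name": "salmon salad", "gluten_free": True, "calories": 700, "protein": 45, "carbs": 0},
    {"name": "lentil stew", "gluten_free": True, "calories": 750, "protein": 35, "carbs": 60},
    {"name": "steak", "gluten_free": True, "calories": 850, "protein": 60, "carbs": 0},
]


def best_meal_package(rows):
    # WHERE: the base constraint is checked on each tuple individually.
    candidates = [r for r in rows if r["gluten_free"]]
    best, best_protein = None, float("-inf")
    # REPEAT 0 (assumed: no repetitions) and COUNT(*) = 3:
    # enumerate every set of three distinct recipes.
    for package in combinations(candidates, 3):
        # SUCH THAT: global constraints are checked on the package as a whole.
        total_calories = sum(r["calories"] for r in package)
        if not 2000 <= total_calories <= 2500:
            continue
        with_carbs = sum(1 for r in package if r["carbs"] > 0)
        if with_carbs < len(package) / 2:  # assumed real-valued division
            continue
        # MAXIMIZE: keep the feasible package with the highest total protein.
        protein = sum(r["protein"] for r in package)
        if protein > best_protein:
            best, best_protein = package, protein
    return best, best_protein


best, best_protein = best_meal_package(recipes)
print([r["name"] for r in best], best_protein)
```

Enumeration grows combinatorially with the number of candidate tuples, so this approach is only workable for toy inputs.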
Package queries are more complex, semantically and algorithmically, than traditional database queries, and they pose challenges on several fronts. Their specifications can be complex, and their results are hard for users to process because of the large volume of packages returned. Our package template abstraction encodes package specifications in a familiar tabular format. The system presents result packages to users in a way that allows them to meaningfully view the entire package space. Furthermore, PackageBuilder allows users to easily navigate the package space and to instruct the system about which constraints should be taken into account.
Screenshot: an example of our package template abstraction.
Stochastic Packages (Current Project)
We must often make decisions in the face of uncertain data. Probabilistic databases excel at modeling and managing uncertainty. Our goal: stochastic optimization within a probabilistic database, close to the data.
A stochastic Portfolio Optimization example:
SELECT PACKAGE(R)
FROM Assets R
SUCH THAT SUM(buy_price) = 1000 AND
          SUM(gain) >= 0 WITH PROBABILITY >= 90%
MAXIMIZE EXPECTED SUM(gain)
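One way to read the probabilistic constraint and the expected-value objective above is through sampled scenarios. The Python sketch below estimates both quantities for a fixed candidate package by Monte Carlo simulation. It is not a description of the system's algorithm. It assumes, purely for illustration, that each asset's gain follows an independent normal distribution with a made-up mean and standard deviation, and it leaves out the deterministic `SUM(buy_price)` constraint.

```python
import random


def sample_scenarios(gain_models, n, seed=0):
    """Draw n scenarios; each maps an asset name to one sampled gain."""
    rng = random.Random(seed)
    return [
        {asset: rng.gauss(mu, sigma) for asset, (mu, sigma) in gain_models.items()}
        for _ in range(n)
    ]


def evaluate_package(package, scenarios):
    """Estimate P(SUM(gain) >= 0) and E[SUM(gain)] for a list of asset names."""
    totals = [sum(scenario[asset] for asset in package) for scenario in scenarios]
    prob_nonneg = sum(1 for t in totals if t >= 0) / len(totals)
    expected_gain = sum(totals) / len(totals)
    return prob_nonneg, expected_gain


# Hypothetical (mean, standard deviation) of the gain of each asset.
gain_models = {"safe": (5.0, 1.0), "risky": (20.0, 100.0)}
scenarios = sample_scenarios(gain_models, 2000)

p_safe, e_safe = evaluate_package(["safe"], scenarios)
p_risky, e_risky = evaluate_package(["risky"], scenarios)

# A package is feasible for the constraint above only if the estimated
# probability of a non-negative total gain is at least 90%.
safe_feasible = p_safe >= 0.90
risky_feasible = p_risky >= 0.90
```

The example shows the tension in the query: under these made-up distributions the risky asset has the higher expected gain, but it fails the 90% probability constraint, so only the safe asset is feasible.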
- Lead Student Researcher
- Incremental Package Query Support
- Zhiru Zhu
- Merritt B. Kowaleski
- Past students and collaborators
Package queries: efficient and scalable computation of high-order constraints
Pre-print @ The VLDB Journal 2018, Special Issue on Best Papers of VLDB 2016
A Scalable Execution Engine for Package Queries
Paper @ SIGMOD Record 2017
ACM SIGMOD Research Highlight Award: Award website
Scalable Package Queries in Relational Database Systems
Paper @ VLDB 2016, Paper (Extended Version on arXiv), Poster, Slides
Best papers of VLDB 2016
Improving Package Recommendations through Query Relaxation
Workshop Paper @ Data4U 2014 in conjunction with VLDB 2014, Slides
PackageBuilder: From Tuples To Packages
Demo Paper @ VLDB 2014, Poster @ VLDB 2014, Challenge Poster @ VLDB 2014, Poster @ NEDB 2014, Technical Report
PackageBuilder: Querying for packages of tuples
Undergraduate Research Poster @ SIGMOD 2014