Fragment list consolidation for tiledb cloud arrays#5059
Fragment list consolidation for tiledb cloud arrays#5059
Conversation
677f049 to
f776d91
Compare
| throw_if_not_ok(fragment_consolidator->consolidate_fragments( | ||
| array_name, encryption_type, encryption_key, key_length, fragment_uris)); | ||
| // Consolidate | ||
| auto fragment_consolidator = |
There was a problem hiding this comment.
removed dynamic_cast that should be avoided when possible.
There was a problem hiding this comment.
Let's not do that now. If we're going to do this, let's change all the places that use Consolidator::create for something better and consistent.
There was a problem hiding this comment.
Why shouldn't we do that immediate improvement? This is the only occurrence of a dynamic_cast in consolidation code, and the reason is that consolidate_fragments is a method of FragmentConsolidator and not of its parent class Consolidator.
There was a problem hiding this comment.
After discussing offline, I am reverting this change, although I think it's improving the current situation, to get this long standing PR finally unblocked and merged.
| // Check if array exists | ||
| ObjectType obj_type; | ||
| throw_if_not_ok( | ||
| object_type(storage_manager->resources(), array_uri, &obj_type)); |
There was a problem hiding this comment.
moved this in the else case so that it's not executed for remote arrays and cause 2 extra REST requests (is_group and is_array)
f776d91 to
124ce98
Compare
124ce98 to
4e092ea
Compare
6b273c4 to
155a23c
Compare
| array_consolidation_request_builder->initFragments( | ||
| fragment_uris->size()); | ||
| for (size_t i = 0; i < fragment_uris->size(); i++) { | ||
| fragment_list_builder.set(i, (*fragment_uris)[i]); |
There was a problem hiding this comment.
those need to become relative to cross the wire, otherwise it's a security concern. Use serialize_array_uri_to_relative and deserialize_array_uri_to_absolute utility functions
155a23c to
3ab70bd
Compare
|
@KiterLuc we should discuss before merging this in case we first need to address this one first: https://app.shortcut.com/tiledb-inc/story/52302/increase-number-of-max-fragments-in-tiledb-fragment-list-consolidation |
| fragment_uris.reserve(fragment_reader.size()); | ||
| for (const auto& fragment_uri : fragment_reader) { | ||
| fragment_uris.emplace_back(fragment_uri); | ||
| fragment_uris.emplace_back(deserialize_array_uri_to_absolute( |
There was a problem hiding this comment.
REST CI is failing because the path normalization here is not happening on REST since go-capnp handles deserializing the request from generated source code. So we are serializing absolute URIs as relative URIs from core client-side, but REST does not call deserialize_consolidation_request and the switch back to absolute never happens.
We could add a tiledb_handle_array_consolidation_request much like tiledb_handle_load_array_schema_request that would just call deserialize_consolidation_request, but we'll need a TileDB-Go binding and subsequent REST PR to update the route to use it.
Another option without the handler and Go binding is to handle a second type of relative URIs in FragmentConsolidator::consolidate_fragments that is relative to the array_uri/ directory. The relative URIs handled in consolidate_fragments linked above is currently relative to the array_uri/__fragments directory.
There was a problem hiding this comment.
How about reconstructing the absolute URIs in HandleConsolidateArrayRequest on REST Server side by prepeding the array URI to arrayConsolidationRequest.Fragments before calling ConsolidateFragments ? Would that be possible? Looks like the least "invasive" solution.
There was a problem hiding this comment.
Opened this to track, we discussed and this is a feasible REST-Server side solution: https://app.shortcut.com/tiledb-inc/story/52679/fix-fragment-list-consolidation-handler-to-convert-relative-uris-to-absolute
Let's get this reviewed in the meantime when you have some spare time.
There was a problem hiding this comment.
This is done and I merged this branch to fix conflicts. I'll watch for CI failures but LGTM 👍
Co-authored-by: Shaun M Reed <shaunrd0@gmail.com>
a4c8c71 to
65c1812
Compare
|
@KiterLuc this is now ready for merging. |
[sc-48774]
This PR implements the needed changes on Core side for supporting fragment list cloud consolidation for
tiledb://arrays.In order for this feature to work end-to-end changes are also needed on the REST server side to deserialize the capnp request and call the right method (
consolidate_fragments) for actually doing the consolidation.TYPE: FEATURE
DESC: Fragment list consolidation for tiledb cloud arrays