- Dec 16, 2022
-
-
Matthias Kretz authored
After this change node::process_batch also works for node implementations that do not support simd arguments for process_one.

ChangeLog:

* graph.h (detail::wider_native_simd_size): Renamed from larger_native_simd_size.
(detail::any_node): Finish concept testing for process_one signature.
(detail::node_can_process_simd): New.
(node::process_batch): Take SIMD branch only if Derived::process_one can be called with simd arguments.
(merged_node::process_one(simd...)): Require that Left and Right process_one can be called with simd arguments.
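The SIMD/scalar dispatch described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in graph.h: `pack<T>` is a hypothetical four-lane stand-in for stdx::simd so that the example has no library dependency, `can_process_simd` is a C++17 detection trait standing in for the `detail::node_can_process_simd` concept, and `doubler`, `scalar_only`, and the free function `process_batch` are invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for stdx::simd<T, Abi>: a pack of four lanes.
template <typename T>
struct pack {
    static constexpr std::size_t size = 4;
    std::array<T, size> lanes{};
};

template <typename T>
pack<T> operator+(const pack<T>& a, const pack<T>& b) {
    pack<T> r;
    for (std::size_t k = 0; k < pack<T>::size; ++k) r.lanes[k] = a.lanes[k] + b.lanes[k];
    return r;
}

// Detection trait (assumed name): true if node.process_one(pack<T>) is well-formed.
template <typename Node, typename T, typename = void>
struct can_process_simd : std::false_type {};

template <typename Node, typename T>
struct can_process_simd<Node, T,
    std::void_t<decltype(std::declval<const Node&>().process_one(std::declval<pack<T>>()))>>
    : std::true_type {};

// Generic node: works for T and for pack<T>.
struct doubler {
    template <typename V>
    V process_one(const V& a) const { return a + a; }
};

// Scalar-only node: cannot be called with pack<int>.
struct scalar_only {
    int process_one(int a) const { return a + 1; }
};

// Processes n elements and returns how many pack-wide calls were made.
template <typename Node, typename T>
std::size_t process_batch(const Node& node, const T* in, T* out, std::size_t n) {
    std::size_t i = 0;
    std::size_t simd_calls = 0;
    if constexpr (can_process_simd<Node, T>::value) {
        // SIMD branch: only compiled when process_one accepts pack<T>.
        for (; i + pack<T>::size <= n; i += pack<T>::size) {
            pack<T> v;
            for (std::size_t k = 0; k < pack<T>::size; ++k) v.lanes[k] = in[i + k];
            const pack<T> r = node.process_one(v);
            for (std::size_t k = 0; k < pack<T>::size; ++k) out[i + k] = r.lanes[k];
            ++simd_calls;
        }
    }
    // Scalar epilogue; for scalar-only nodes this loop handles everything.
    for (; i < n; ++i) out[i] = node.process_one(in[i]);
    return simd_calls;
}
```

With six input elements, `doubler` takes one pack-wide call plus a two-element scalar epilogue, while `scalar_only` is processed entirely by the scalar loop.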
-
Matthias Kretz authored
The new node::process_batch function calls process_one for processing as many elements as given via the input ranges. The function expects all input ranges to be of the same size. This is encoded as a pre-condition in the function body.

The first argument to process_batch is a buffer that provides simple span<return_type> access. This is still an *incorrect* solution since return_type is a std::tuple of all output port types. A correct solution provides access to a span per type in the return_type tuple.

If all input and output ranges are contiguous ranges, and if all input and output port types are vectorizable (in the std::experimental::simd definition), then process_batch calls process_one with simd<T> arguments instead of T. This is still missing a correct constraint on the process_one function, testing whether it actually works with simd arguments.

ChangeLog:

* ce.cpp (test1): New function. Tests the generic process_batch implementation and its capabilities to use stdx::simd.
* graph.h (detail::precondition): New.
(port_data): New.
(node::process_batch_simd_epilogue): New.
(node::process_batch): New.
(adder::process_one): Accept simd<T, Abi> or T.
(scale::process_one): Accept simd<T, Abi> or T.
(merged_node::process_one): Accept simd<T, Abi> or T.
* typelist.h (meta::transform_types): New.
(meta::transform_value_type): New.
(meta::reduce): New.
(typelist::construct): New.
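To illustrate the "span per type in the return_type tuple" layout that the message calls the correct solution, here is a minimal sketch. All names are hypothetical: `sum_and_mean` is an invented two-output node, `port_buffers` maps a `std::tuple<Ts...>` to one `std::vector` per output type (vectors are used in place of spans to keep the example C++17), and this `process_batch` reports a size mismatch by returning `false`, whereas the commit encodes it as a precondition in the function body.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical node with two inputs and two outputs of different types.
struct sum_and_mean {
    using return_type = std::tuple<int, double>;
    return_type process_one(int a, int b) const {
        return std::make_tuple(a + b, (a + b) / 2.0);
    }
};

// Maps std::tuple<Ts...> to one contiguous buffer per output port.
template <typename Tuple>
struct port_buffers;

template <typename... Ts>
struct port_buffers<std::tuple<Ts...>> {
    using type = std::tuple<std::vector<Ts>...>;
};

// Writes each element of result r into slot i of the matching buffer.
template <typename Buffers, typename Result, std::size_t... Is>
void scatter(Buffers& out, std::size_t i, const Result& r, std::index_sequence<Is...>) {
    ((std::get<Is>(out)[i] = std::get<Is>(r)), ...);
}

template <typename Node>
bool process_batch(const Node& node,
                   typename port_buffers<typename Node::return_type>::type& out,
                   const std::vector<int>& in0, const std::vector<int>& in1) {
    // All input ranges must have the same size (a checked return here,
    // a precondition in the original).
    if (in0.size() != in1.size()) return false;

    const std::size_t n = in0.size();
    std::apply([n](auto&... buffer) { (buffer.resize(n), ...); }, out);

    constexpr std::size_t outputs = std::tuple_size_v<typename Node::return_type>;
    for (std::size_t i = 0; i < n; ++i) {
        scatter(out, i, node.process_one(in0[i], in1[i]),
                std::make_index_sequence<outputs>{});
    }
    return true;
}
```

Each output port ends up in its own contiguous buffer, which is the layout a per-port SIMD store would need; a single buffer of tuples interleaves the port types instead.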
-
- Dec 15, 2022
-
-
Matthias Kretz authored
This is the correct default because the goal of merging nodes is to improve performance. If all intermediate results were returned (passed on) then the optimization would be greatly reduced.

ChangeLog:

* ce.cpp: Adjust for different return type of merged node.
* graph.h (any_node): Move into detail namespace. Implement the concept, though the check for process_one does not compile yet.
(node): Add return_type member type and make default constructor protected (never to be constructed directly).
(duplicate): New node type that copies its input to `Count` outputs.
(merged_node): Remove connected output port from output_ports.
(merged_node::process_one): Move the result from the connected output port from apply_left into apply_right and remove it from the tuple returned from the function.
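The behaviour described for merged_node::process_one can be sketched as below. This is a simplified illustration, not the graph.h implementation: it assumes the connected port is always output 0 of the left node and the only input of the right node, and that the remaining left outputs are returned before the right outputs. The names `div_mod`, `scale_by_ten`, `merged`, and `drop_first` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Returns a tuple holding every element of t except the first.
template <typename Tuple, std::size_t... Is>
auto drop_first_impl(Tuple&& t, std::index_sequence<Is...>) {
    return std::make_tuple(std::get<Is + 1>(std::forward<Tuple>(t))...);
}

template <typename Tuple>
auto drop_first(Tuple&& t) {
    constexpr std::size_t n = std::tuple_size_v<std::decay_t<Tuple>>;
    static_assert(n >= 1);
    return drop_first_impl(std::forward<Tuple>(t), std::make_index_sequence<n - 1>{});
}

// Hypothetical left node with two outputs: quotient and remainder.
struct div_mod {
    std::tuple<int, int> process_one(int a, int b) const {
        return std::make_tuple(a / b, a % b);
    }
};

// Hypothetical right node with one input and one output.
struct scale_by_ten {
    std::tuple<int> process_one(int x) const { return std::make_tuple(x * 10); }
};

template <typename Left, typename Right>
struct merged {
    Left left;
    Right right;

    template <typename... Args>
    auto process_one(Args&&... args) const {
        auto l = left.process_one(std::forward<Args>(args)...);
        // The connected output is moved into the right node ...
        auto r = right.process_one(std::move(std::get<0>(l)));
        // ... and removed from the tuple that the merged node returns.
        return std::tuple_cat(drop_first(std::move(l)), std::move(r));
    }
};
```

For example, `merged<div_mod, scale_by_ten>{}.process_one(17, 5)` yields a two-element tuple `(2, 30)`: the quotient 3 is consumed by the right node and never appears in the result.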
-
- Dec 12, 2022
-
-
Matthias Kretz authored
ChangeLog:

* ce.cpp: New file.
* graph.h: New file.
* typelist.h: New file.
-