labath added a comment.
I have a lot of comments. The two major ones are:
- i think the way you link the tests is in the UB territory. I explain this in
detail in one of the inline comments.
- I believe that your unit tests (not just in this patch) focus too much on
testing the behavior of a single method, even when that method does not have an
easily observable results. For this you then often need to expose internals of
the tested class to be able to test some effect of that method on the internal
state. This not all bad (i've done it myself sometimes), and it's definitely
better that not writing any unit tests. However, it's generally better to test
the public interface of a method/class/entity, and in this case, I believe it
should be possible. I'm imagining some tests like having a dummy class with a
bunch of methods which are annotated with the SB_RECORD macros. Then, in the
test you call the methods of this class with some arguments, have the repro
framework serialize the calls to a (in-memory?) stream, and verify the contents
of that stream. This will test a lot more code -- the (De)Serialize functions
are just fancy ways of invoking memcpy, but if you take all of that together
with the SB_RECORD, SB_REGISTER macros, it becomes some pretty deep magic that
is interesting to test exhaustively (it's fine, but not really interesting to
test that Deserialize<T*> and Deserialize<T&> do the right thing for each of
the possible fundamental types.) And if you use the deserializer interface to
read out the recorded traces (instead of comparing raw bytes), then you can
avoid depending on the endianness of the values.
Conversely, you can write a trace file with the serializer interface, and then
tell the replayer to invoke the mock class which will verify that it was called
with the right arguments.
================
Comment at: source/API/SBReproducer.cpp:1
+//===-- SBReproducer.cpp ----------------------------------------*- C++
-*-===//
+//
----------------
I found it weird to have one cpp file implementing two headers
(`SBReproducer.h` and `SBReproducerPrivate.h`). Can you split it into two
files? (This will come out naturally, if we split this up into modules as I
mention in one of the other comments.)
================
Comment at: source/API/SBReproducer.cpp:69
+ repro::Loader *loader = repro::Reproducer::Instance().GetLoader();
+ if (!loader) {
+ return false;
----------------
drop `{}` (that's what you seem to do elsewhere in this file).
================
Comment at: source/API/SBReproducerPrivate.h:44
+ void *object = GetObjectForIndexImpl(idx);
+ return static_cast<typename std::remove_const<T>::type *>(object);
+ }
----------------
Why do you need to `remove_const` here? If `T` is `const`, then const will be
added back by the return statement anyway. If `T` is not `const`, then
`remove_const` is a noop.
================
Comment at: source/API/SBReproducerPrivate.h:65-69
+ auto it = m_mapping.find(idx);
+ if (it == m_mapping.end()) {
+ return nullptr;
+ }
+ return m_mapping[idx];
----------------
replace by `m_mapping.lookup(idx)`, at which point you can consider dropping
this function entirely.
================
Comment at: source/API/SBReproducerPrivate.h:82-93
+/// Base class for tag dispatch used in the SBDeserializer. Different tags are
+/// instantiated with different values.
+template <unsigned> struct SBTag {};
+
+/// We need to differentiate between pointers to fundamental and
+/// non-fundamental types. See the corresponding SBDeserializer::Read method
+/// for the reason why.
----------------
Just curious: What's the advantage of this over just declaring a bunch of Tag
classes directly (struct PointerTag{}; struct ReferenceTag{}; ...)?
================
Comment at: source/API/SBReproducerPrivate.h:121
+public:
+ SBDeserializer(llvm::StringRef buffer = {}) : m_buffer(buffer), m_offset(0)
{}
+
----------------
Instead of the m_offset variable, what do you think about just `drop_front`ing
the appropriate amount of bytes when you are done with the? It doesn't look
like you'll ever need to go back to an earlier point in the stream...
================
Comment at: source/API/SBReproducerPrivate.h:155-161
+ // FIXME: We have references to this instance stored in replayer instance. We
+ // should find a better way to swap out the buffer after this instance has
+ // been created, but his will have to do for now.
+ void LoadBuffer(llvm::StringRef buffer) {
+ m_buffer = buffer;
+ m_offset = 0;
+ }
----------------
It sounds to me like this could be solved by passing the deserializer to the
`operator()` of the SBReplayers instead of having it a member variable set by
the constructor. This design also makes more sense to me, as theoretically
there is no reason why all replay calls would have to come from the same
deserializer object. You may even want to have a separate deserializer for each
thread when you get around to replaying multithreaded recordings.
After this, it may even be possible to make the deserializer object be a local
variable in the `SBRegistry::Replay` function.
================
Comment at: source/API/SBReproducerPrivate.h:167
+private:
+ template <typename T> T Read(ValueTag) {
+ T t;
----------------
I guess if we don't consider the recording to be "input" then we don't have to
handle the case of not having enough bytes in the file extremely gracefully
here, but it would still be nice to at least assert that we are not running off
the end of the buffer here.
================
Comment at: source/API/SBReproducerPrivate.h:169
+ T t;
+ std::memcpy((char *)&t, &m_buffer.data()[m_offset], sizeof(t));
+ m_offset += sizeof(t);
----------------
reinterpret_cast
================
Comment at: source/API/SBReproducerPrivate.h:182
+ typedef typename std::remove_reference<T>::type UnderlyingT;
+ // If this is a reference to a fundamental type we just read its value.
+ return *m_index_to_object.template GetObjectForIndex<UnderlyingT>(
----------------
Is this correct? I would expect the fundamental references to not go through
this overload at all...
================
Comment at: source/API/SBReproducerPrivate.h:303
+ DeserializationHelper<Args...>::template deserialized<Result>::doit(
+ m_deserializer, f));
+ }
----------------
Shouldn't this call `g` then?
================
Comment at: source/API/SBReproducerPrivate.h:307-309
+ Result (*f)(Args...);
+ /// A custom function.
+ Result (*g)(Args...);
----------------
If you gave these more meaningful names, then the comments would be unneeded.
================
Comment at: source/API/SBReproducerPrivate.h:370
+ /// Register the given replayer for a function (and the ID mapping).
+ void DoRegister(uintptr_t RunID, SBReplayer *replayer, unsigned id) {
+ m_sbreplayers[RunID] = std::make_pair(replayer, id);
----------------
It looks like this `id` argument isn't really needed here, as the DoRegister
function can just synthesize the id on its own. In fact, the `m_id` is probably
not needed either, as you can just use `m_sbreplayers.size()+1` or something.
================
Comment at: source/API/SBReproducerPrivate.h:371
+ void DoRegister(uintptr_t RunID, SBReplayer *replayer, unsigned id) {
+ m_sbreplayers[RunID] = std::make_pair(replayer, id);
+ m_ids[id] = replayer;
----------------
`assert(m_sbreplayers.find(RunID) == end())` to make sure our RunID magic
didn't do something funny?
================
Comment at: source/API/SBReproducerPrivate.h:419-448
+/// Maps an object to an index for serialization. Indices are unique and
+/// incremented for every new object.
+///
+/// Indices start at 1 in order to differentiate with an invalid index (0) in
+/// the serialized buffer.
+class SBObjectToIndex {
+public:
----------------
What's the threading scenario in which you expect this to be called? If
multiple threads can call `GetIndexForObject` simultaneously (even if the
objects differ), then I think you need to guard `m_mapping` with the mutex too,
not just the integer variable.
BTW: The `m_index` can also just be `m_mapping.size()+1`
================
Comment at: source/API/SBReproducerPrivate.h:453
+public:
+ SBSerializer(llvm::raw_ostream &stream = llvm::outs()) : m_stream(stream) {}
+
----------------
Is this default value used for anything? It don't see it being used, and it
doesn't seem particularly useful anyway, as you'd just get a lot of binary junk
on stdout.
================
Comment at: source/API/SBReproducerPrivate.h:482
+ if (std::is_fundamental<T>::value) {
+ Serialize(t);
+ } else {
----------------
Will this ever not be an infinite recursion? I am guessing this works because
all fundamental types are picked up by the specific overloads below before this
function is even invoked.
Maybe if you just inline the `m_stream.write(reinterpret_cast<const char
*>(&t), sizeof(Type))` thingy here, you can avoid having to define a special
Serialize function for all fundamental types.
================
Comment at: source/API/SBReproducerPrivate.h:484
+ } else {
+ int idx = m_tracker.GetIndexForObject(&t);
+ Serialize(idx);
----------------
It looks like `GetIndexForObject` returns `unsigned`
================
Comment at: source/API/SBReproducerPrivate.h:562-563
+ m_local_boundary(false), m_result_recorded(false) {
+ if (!g_global_boundary) {
+ g_global_boundary = true;
+ m_local_boundary = true;
----------------
If you wanted to make the updates to g_global_boundary thread-safe, this is not
the way to achieve that. This is still going to be racy, so you should at least
use the atomic compare-and-exchange operations. However, I don't believe this
will still be right thing to do in the multithreaded case, so it may be best do
just drop the atomicity until you implement proper threading support.
================
Comment at: source/API/SBReproducerPrivate.h:573-574
+
+ void SetSerializer(SBSerializer &serializer) { m_serializer = &serializer; }
+
+ /// Records a single function call.
----------------
Move this to the constructor, and use `llvm::Optional` in the `SB_RECORD`
macros?
This will also avoid having this class muck around with `g_global_boundary`
when we are not recording.
================
Comment at: source/API/SBReproducerPrivate.h:578
+ void Record(Result (*f)(FArgs...), const RArgs &... args) {
+ if (!ShouldCapture()) {
+ return;
----------------
Inconsistent braces around a single statement.
================
Comment at: source/API/SBReproducerPrivate.h:584-585
+
+ if (Log *log = GetLogIfAllCategoriesSet(LIBLLDB_LOG_API))
+ log->Printf("Recording #%u '%s'", id, m_pretty_func.data());
+
----------------
if you used the `LLDB_LOG` macro, you wouldn't have to rely on m_pretty_func
being implicitly null terminated.
================
Comment at: source/API/SBReproducerPrivate.h:618-626
+ void RecordOmittedResult() {
+ if (m_result_recorded)
+ return;
+ if (!ShouldCapture())
+ return;
+
+ m_serializer->SerializeAll(0);
----------------
I'm wondering if we shouldn't embed some additional safeties here to make sure
the result is always recorded when it should be. It seems to be this class
should know whether it should expect an explicit `RecordResult` call (based on
whether we called the `void` version of `Record` or not). So, we could have
this class assert if we reached the destructor without recording anything,
instead of silently recording zero. WDYT?
================
Comment at: source/API/SBReproducerPrivate.h:659
+ if (auto *g = lldb_private::repro::Reproducer::Instance().GetGenerator()) {
\
+ lldb_private::repro::SBRecorder sb_recorder(__PRETTY_FUNCTION__);
\
+ sb_recorder.SetSerializer(
\
----------------
`__PRETTY_FUNCTION__` is not portable. I think you want LLVM_PRETTY_FUNCTION
here.
================
Comment at: tools/driver/Driver.cpp:913-917
+ SBReproducer reproducer;
+ if (reproducer.Replay()) {
+ return 0;
+ }
+
----------------
The driver should know whether it is replaying things or not, right? Can we
have it call `Replay` only if replay mode has actually been requested?
================
Comment at: unittests/API/CMakeLists.txt:4-12
+ LINK_LIBS
+ liblldb
+ lldbCore
+ lldbHost
+ LINK_COMPONENTS
+ Support
+ )
----------------
Hmm.. I'm not sure this is going to work everywhere, but it's certainly going
to be weird. `liblldb` is a shared library which deliberately firewalls all
`lldb_private` symbols. So any reproducer symbols you defined there will not be
accessible to your unit tests. I guess the reason `include_directories` trick
kind of works is that most of the reproducer frameworks is implemented in the
header, and including that header from the test will cause another copy of all
of these symbols to be defined. However, I am still surprised that it does
work, as you still have some chunks of this implemented in the cpp file. I
guess you were just lucky to not need those symbols in the tests that you have
written. Nonetheless, this is going into very murky waters, and I think the
fact that you had to link in lldbHost and lldbCore in addition to liblldb
(even though they are be included in there) demonstrates it.
I think the best way to address this is to move the core of the repro engine
into some other library, which can be safely used from unit tests. After all,
the core is not really tied to the SB API, and you could easily use it to
record any other interface, if you needed to do that for whatever reason. I
think this should go into the Utility module, as it doesn't have any other
dependencies. I see that you're including FileSystem.h, but it looks like this
is only used in the `SetInputFileHandleRedirect` function, which is not
actually used anywhere! Even when it is used, I think it would make sense for
it to live elsewhere, as this is not a part of the replay core, but rather how
a particular api wants to replay a particular function signature.
================
Comment at: unittests/API/SBReproducerTest.cpp:37-41
+ 0x01, 0x00, 0x61, 0xcd, 0xcc, 0x8c, 0x3f, 0x02, 0x00, 0x00, 0x00,
+ 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x00, 0x62, 0x06, 0x00, 0x00,
+ 0x00, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x08, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09, 0x00};
----------------
All of these buffers hard-code endiannes of integers and the representation of
floats.
================
Comment at: unittests/API/SBReproducerTest.cpp:83-85
+ SBDeserializer deserializer("a");
+ EXPECT_TRUE(deserializer.HasData());
+ EXPECT_FALSE(deserializer.HasData(1));
----------------
I find this behaviour confusing. Intuitively, I'd expect `HasData(n)` if buffer
has at least `n` bytes of data left. So in this case I'd expect `true` in both
calls. (In fact I'd probably expect `HasData()` to be a synonym for
`HasData(1)`). It looks like you're only calling `HasData` in one place, and
you're hardcoding `1` there, so maybe you just want an `EOF()` function?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D56322/new/
https://reviews.llvm.org/D56322
_______________________________________________
lldb-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits