Skip to content

Avoiding cloning with pico #793

@rbalicki2

Description

@rbalicki2

Right now, we clone unnecessarily a lot! (look for .note_todo("Do not clone. Use a MemoRef."). There are 48 results. Yeesh.)

The issue is that pico used to not work with MemoRef's, and the alternative was extensive cloning. Now, we've caused it to work well with MemoRef, so we can go back and clone less.

I want to sketch the general plan for how we might minimize cloning for client selectables.

Principles

  • Functions whose result changes often (or whose results change superfluously) should not be called from functions that are expensive to run.

Client selectables

Client selectables are created from client declarations. (Client declarations are created from extractions, which are created from the source text, but that's not relevant). Let's assume for this issue that we want to keep that structure, though it can probably be improved. We want to minimize cloning and maximize reuse of values.

There are two needs:

  • to access an individual selectable, and
  • to iterate over every selectable

Accessing an individual selectable (identified by (entity name, selectable name), which we call a selectable_id) is not so easy. We don't know where it's defined, for example! (And it might be defined in multiple locations, invalidly).

So, working backwards, there must be a function that does the bare minimum of parsing of each input file. The result of this parsing is a client declaration. So

fn get_declaration_map(db) -> HashMap<SelectableId, Declaration> {
  db.get_all_file_contents().iter().map(...).collect()
}

(There are no Result<T, E> in this hash map! If we can't parse a given extraction, we simply ignore it. What else can we do? See below.)

Now, the result of calling this function will change superfluously. Changes to unrelated files will cause it to have a different value. So, we need an intervening function to cut down on the superfluous changes:

fn get_declarations(db, selectable_id) -> 
  Option<MemoRef<Vec<Declaration>>>
{
  db.intern_ref(get_declaration_map(db).get(selectable_id))
}

Okay, great, the results of this function will only change if the declaration has changed. That's good!

But, what should the return value be? Should it be a MemoRef? Alas no. MemoRef seems like the right move, but it's actually wrong, because if the map changes, we will have to return a new memo ref, i.e. it doesn't allow us to short circuit very well. But that's what we want!

So, we probably want to clone. :/

fn get_declarations(db, selectable_id) -> 
  Option<Vec<Declaration>>
{
  get_declaration_map(db).get(selectable_id).cloned()
}

So, we have to keep at least two copies of each declaration in memory. Not the end of the world, but really, not ideal. What else can we do here?

  • we can try to make the "parse the minimal amount of everything" step even more minimal.

Anyway, to be continued.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions