Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure into spec to better fit possible usages #64

Open
5 tasks
stanislowskij opened this issue Sep 11, 2024 · 4 comments
Open
5 tasks

Restructure into spec to better fit possible usages #64

stanislowskij opened this issue Sep 11, 2024 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@stanislowskij
Copy link
Collaborator

stanislowskij commented Sep 11, 2024

into is simultaneously under-spec'd and over-spec'd right now due to some weird edge cases that come with the function. Officially, the documentation notes that, with two arguments, into should take two collections, but there are several cases where this is not true.

  • If to is any value or expression, and from is an empty collection, (into to from) returns evaluation of to.
user=> (into :a [])
:a

⚠️ Incorrectly fails Babel spec:

babel.middleware=> (into :a [])
The first argument of (into :a []) was expected to be a sequence but is a keyword :a instead.
  • If from is a non-empty collection, (into nil from) performs conj on (list (first from)) and the rest of the list, regardless of what the type of collection from is.
user=> (into nil [:a :b])
(:b :a)

Same result in Babel.

  • If to is a string, and from is a non-empty string, (into to from) throws an exception.
user=> (into "" "test")
Execution error (ClassCastException) at user/eval1579 (REPL:1).
class java.lang.String cannot be cast to class clojure.lang.IPersistentCollection (java.lang.String is in module java.base of loader 'bootstrap'; clojure.lang.IPersistentCollection is in unnamed module of loader 'bootstrap')

Correctly fails Babel spec, though might need to be improved, since the message is wrong (see below):

babel.middleware=> (into "" "test")
The first argument of (into "" "test") was expected to be a sequence but is a string "" instead.
  • Note that this still works if from is an empty string:
user=> (into "a" "")
"a"

⚠️ Incorrectly fails Babel spec:

babel.middleware=> (into "a" "")
The first argument of (into "a" "") was expected to be a sequence but is a string "a" instead.

  • Note also that Babel specs incorrectly pass if to is a collection and from is not. In these cases, we get an error on seq instead (which is also a problem. See: Need spec on seq #63).
babel.middleware=> (into [1 2 3] 4)
Don't know how to create a sequence from a number.
@stanislowskij stanislowskij added the bug Something isn't working label Sep 11, 2024
@stanislowskij
Copy link
Collaborator Author

I will leave the internal definition of into here in case it's useful for analyzing, finding more cases / a better way to approach this:

(defn into
  "Returns a new coll consisting of to with all of the items of
  from conjoined. A transducer may be supplied.
  (into x) returns x. (into) returns []."
  {:added "1.0"
   :static true}
  ([] [])
  ([to] to)
  ([to from]
     (if (instance? clojure.lang.IEditableCollection to)
       (with-meta (persistent! (reduce conj! (transient to) from)) (meta to))
       (reduce conj to from)))
  ([to xform from]
     (if (instance? clojure.lang.IEditableCollection to)
       (let [tm (meta to)
             rf (fn
                  ([coll] (-> (persistent! coll) (with-meta tm)))
                  ([coll v] (conj! coll v)))]
         (transduce xform rf (transient to) from))
       (transduce xform conj to from))))

@stanislowskij
Copy link
Collaborator Author

stanislowskij commented Sep 11, 2024

This is our spec on into as it sits right now:

(s/fdef clojure.core/into
  :args (s/and :babel.arity/zero-to-three
               (s/or :arg-one (s/cat :any (s/? any?))
                     :arg-two (s/cat :coll (s/nilable :babel.type/coll) :any any?)
                     :arg-three (s/cat :coll (s/nilable :babel.type/coll) :function :babel.type/function-or-lazy :coll any?))))

@stanislowskij
Copy link
Collaborator Author

stanislowskij commented Sep 12, 2024

@elenam Here are some discoveries I made today while diving down this rabbit hole, which may influence how we rewrite these specs going forward. There are two key invariants on conj and reduce respectively which categorize all of these examples into two "cases" of unexpected behavior.

  • (into to from) is functionally equivalent to (reduce conj to from) for TWO ARGS ONLY. There are nuances with zero or one arguments that produce inconsistent behavior.

What does (reduce conj to from) actually do?

Case I: conj has this invariant which produces interesting behavior when called with reduce.

From documentation:

conj[oin]. Returns a new collection with the xs
'added'. (conj nil item) returns (item).
(conj coll) returns coll. (conj) returns [].
The 'addition' may happen at different 'places' depending
on the concrete type.

(conj nil item) $\rightarrow$ (list item), NOT (seq item) which is an important distinction.

Therefore, we can expect that:

(into nil x) = (reduce conj nil x) $\rightarrow$ (reduce conj (list (first x)) (drop 1 x))

i.e.,

(into nil "hello")
$\rightarrow$ (reduce conj '(\h) "ello")
$\rightarrow$ (reduce conj '(\e, \h) "llo")
...
$\rightarrow$ (\o, \l, \l, \e, \h)


Case II: reduce also has its own interesting invariant.

From documentation:

f should be a function of 2 arguments. If val is not supplied,
returns the result of applying f to the first 2 items in coll, then
applying f to that result and the 3rd item, etc. If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments.
If coll has only 1 item, it
is returned and f is not called. If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.

It's not stated here, but I suspect that, by "contains no items," they actually mean that (count coll) returns 0, since using "" and nil in place of an empty collection works the exact same.

Also, this invariant applies to cases with two arguments, in which coll is effectively ignored.

i.e., define empty such that (any? empty) and (= 0 (count empty)). Then:

With one argument: (into empty) = (reduce conj empty) $\rightarrow$ (conj) $\rightarrow$ []

With two arguments: (into x empty) = (reduce conj x empty) $\rightarrow$ (conj x) $\rightarrow$ x

@stanislowskij
Copy link
Collaborator Author

One proposed idea for spec on into:

one-argument: any?

two-arguments: ((first: coll?) and (second: (seqable? and non-empty?)))
or
((first: any?) and second: (seqable? and empty?))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants