
When should I use cats / scalaz instead of standard library functions, Hacker News

I would say the general value of typed functional programming lies in leaving no edge cases. It goes about this by treating everything as a composable value:

  • Values are values (strings, integers, doubles, combinations of these into case classes, etc.)
  • Functions are values: higher-order functions take other functions as arguments and/or return them as values. When they only take other functions as arguments and return functions, they're called "combinators."
  • State manipulators are values.
  • Errors are values (that aren't "thrown").
  • Schedulers of concurrent operations are values.
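
As a rough illustration of the list above, here is a sketch using only the standard library. The names `ValuesDemo`, `twice`, and `parseAge` are made up for this example. It shows a function, a combinator, and an error all handled as ordinary values:

```scala
import scala.util.Try

object ValuesDemo {
  // A function stored in a val, like any other value.
  val double: Int => Int = _ * 2

  // A combinator: takes a function and returns a new function.
  def twice[A](f: A => A): A => A = f andThen f

  // An error as a value: failure is returned in a Left, never thrown.
  def parseAge(raw: String): Either[String, Int] =
    Try(raw.trim.toInt).toOption match {
      case Some(n) if n >= 0 => Right(n)
      case Some(n)           => Left(s"negative age: $n")
      case None              => Left(s"not an integer: '$raw'")
    }
}
```

Because `parseAge` returns its failure instead of throwing it, the caller has to decide what to do with the `Left`, so the bad input cannot silently disappear.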

Why does this matter to a data scientist or data engineer?

Because data science and data engineering are concerned with probability and statistics, and probability and statistics are sensitive to information loss. If nothing else, you can view consistent use of the "Scalazzi safe subset" of Scala with Cats as a kind of "informational hygiene" for your data.

This gets to your observation that "many of the functionalities are already implemented in Scala - cats reimplements flatMap on List using flatMap method on List etc." What you're seeing is that Cats provides an encoding of an abstraction called typeclasses. These have the abstract names people love to hate: Semigroup, Monoid, Functor, Applicative, Monad, etc. Cats has some concrete data types of its own in the data package, but as you're describing, it also has instances of the typeclasses for many of Scala's standard types in the instances package. A key thing about the typeclasses is that each of them obeys a small set of algebraic laws, and these laws are tested with a tool called Discipline. So use of the typeclasses tends to accomplish at least two things:

  1. You can trade one type that has an instance of the typeclass with another that also does. For example, List and Set both have Monoid instances, but their semantics otherwise are obviously different.
  2. If your code is parametric in a type, but constrained to a typeclass, and you then use either the typeclass instance explicitly or the Cats-provided syntax for it, and stick to the Scalazzi safe subset of the language, your code inherits the law-abiding guarantees of the typeclass.
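
A minimal sketch of both points, assuming Cats is on the classpath. The `Summary` object and `combineAll` helper are hypothetical names for this example, and Cats already ships its own `combineAll`:

```scala
import cats.Monoid
import cats.implicits._

object Summary {
  // Works for any A with a Monoid instance. Only `empty` and `|+|` are used,
  // so the result is governed by the Monoid laws (associativity, identity).
  def combineAll[A: Monoid](values: List[A]): A =
    values.foldLeft(Monoid[A].empty)(_ |+| _)
}
```

`Summary.combineAll(List(List(1), List(2, 3)))` concatenates to `List(1, 2, 3)`, while `Summary.combineAll(List(Set(1), Set(1, 2)))` takes the union, `Set(1, 2)`. The code is the same and the lawful structure is the same, but the semantics differ per instance.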

Point 2 is the reason you tend to see a lot of Scala FP code written in "finally-tagless" style, where it's all parameterized by F[_] constrained by one or more typeclasses. Because the author can only use the functionality provided by the typeclasses, they know their code is obeying the relevant algebraic laws.
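
A small sketch of what "parameterized by F[_] constrained by a typeclass" can look like, assuming Cats. `Tagless` and `step` are illustrative names, and real finally-tagless code usually also abstracts its own operations behind traits in `F`:

```scala
import cats.Monad
import cats.implicits._

object Tagless {
  // F is abstract: the body can only use what Monad provides
  // (map and flatMap via the for-comprehension), nothing type-specific.
  def step[F[_]: Monad, A](start: F[A])(next: A => F[A]): F[(A, A)] =
    for {
      a <- start
      b <- next(a)
    } yield (a, b)
}
```

The same `step` runs unchanged with `Option`, `List`, or an effect type, because all it relies on is the `Monad` constraint.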

"Second question, if I should learn the advanced concepts which of them you consider the most practical and necessary?"

Shapeless.

In data engineering in particular, you're going to spend a lot of time transforming some data you don't have any control over into some internal structure you do. Shapeless is the Swiss Army Chainsaw of this kind of work. I would say pretty much its central concepts are the Generic and LabelledGeneric representations of product types (usually case classes), its HList type (heterogeneous lists, i.e. lists whose elements can be of different types), its Coproduct type (an Either of more than 2 elements), and its Poly (polymorphic function) type. I've done a fair amount of data engineering work that consisted primarily of folding a Poly over a value of some Coproduct type to transform it from the constituent type to some internal target type as part of an ingestion pipeline. At I/O boundaries, the Typeable typeclass can be handy to see if some value that's typed Any statically can be converted to some target type, and safely cast it if so. This is particularly handy to combine with Cats' ValidatedNel and Applicative builder syntax to, e.g., easily parse an incoming row of data into a case class or get a NonEmptyList of errors explaining why that was impossible.
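
A hedged sketch of the ValidatedNel part of that idea, assuming Cats. The `RowParser` object, the `Map[String, String]` row shape, and the error messages are all invented for illustration. Plain `Try`-based parsing stands in for Shapeless' Typeable here:

```scala
import cats.data.ValidatedNel
import cats.implicits._
import scala.util.Try

final case class Person(name: String, age: Int, height: Double)

object RowParser {
  type Result[A] = ValidatedNel[String, A]

  def nonEmpty(field: String, raw: String): Result[String] =
    if (raw.trim.nonEmpty) raw.trim.validNel else s"$field is empty".invalidNel

  def int(field: String, raw: String): Result[Int] =
    Try(raw.trim.toInt).toOption.toValidNel(s"$field is not an integer: '$raw'")

  def double(field: String, raw: String): Result[Double] =
    Try(raw.trim.toDouble).toOption.toValidNel(s"$field is not a number: '$raw'")

  // Each field is validated independently; mapN either builds the Person
  // or accumulates every error into a NonEmptyList.
  def parse(row: Map[String, String]): Result[Person] =
    (
      nonEmpty("name", row.getOrElse("name", "")),
      int("age", row.getOrElse("age", "")),
      double("height", row.getOrElse("height", ""))
    ).mapN(Person.apply)
}
```

Unlike a chain of `Either`s, which stops at the first failure, the applicative composition reports all bad fields of a row at once, which tends to matter when cleaning data you did not produce.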

With respect to feature engineering, in practice, in Scala, that ends up meaning you'll want a lot of flexibility in "chopping up" case classes into their constituent fields, doing something field-specific, and recombining the fields, possibly excluding some, possibly including others from other sources, etc. Shapeless' extensible records are incredibly useful for that sort of work, and although the documentation does not explicitly say so there, the Repr type of a LabelledGeneric representation of a case class is such an extensible record type:

    psnively-psnively@ import shapeless._, syntax.singleton._, record._
    import shapeless._, syntax.singleton._, record._

    psnively-psnively@ case class Person(name: String, age: Int, height: Double)
    defined class Person

    psnively-psnively@ val personGen = LabelledGeneric[Person]
    personGen: LabelledGeneric[Person]{type Repr = String with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("name")], String] :: Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("age")], Int] :: Double with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("height")], Double] :: shapeless.HNil} = shapeless.LabelledGeneric$$anon$1@2019bcbbf

    psnively-psnively@ val p = Person("Paul", 89, 6.33)
    p: Person = Person("Paul", 89, 6.33)

    psnively-psnively@ val pRec = personGen.to(p)
    pRec: personGen.Repr = "Paul" :: 89 :: 6.33 :: HNil

    psnively-psnively@ val newField = pRec + ('weight ->> 90.5)
    newField: String with labelled.KeyTag[Symbol with tag.Tagged[String("name")], String] :: Int with labelled.KeyTag[Symbol with tag.Tagged[String("age")], Int] :: Double with labelled.KeyTag[Symbol with tag.Tagged[String("height")], Double] :: Double with labelled.KeyTag[Symbol with tag.Tagged[String("weight")], Double] :: HNil = "Paul" :: 89 :: 6.33 :: 90.5 :: HNil

    psnively-psnively@ val sRec = pRec + ('name ->> "Steve")
    sRec: String with labelled.KeyTag[Symbol with tag.Tagged[String("name")], String] :: Int with labelled.KeyTag[Symbol with tag.Tagged[String("age")], Int] :: Double with labelled.KeyTag[Symbol with tag.Tagged[String("height")], Double] :: HNil = "Steve" :: 89 :: 6.33 :: HNil

    psnively-psnively@ val s = personGen.from(sRec)
    s: Person = Person("Steve", 89, 6.33)

    psnively-psnively@ newField.get('age)
    res9: Int = 89

    psnively-psnively@ newField.get('weight)
    res10: Double = 90.5

    psnively-psnively@ sRec.get('weight)
    cmd11.sc:1: No field Symbol with shapeless.tag.Tagged[String("weight")] in record String with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("name")], String] :: Int with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("age")], Int] :: Double with shapeless.labelled.KeyTag[Symbol with shapeless.tag.Tagged[String("height")], Double] :: shapeless.HNil
    val res11 = sRec.get('weight)
                        ^
    Compilation Failed

For numerical work, have a look at Spire. One underappreciated gem about it is that it has (not heavily optimized; caveat emptor) support for automatic differentiation, which is becoming popular in machine learning circles. Also, its various numeric types have cats-kernel typeclass instances.
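
For readers unfamiliar with automatic differentiation, here is a minimal sketch of the forward-mode idea using hand-rolled dual numbers. This is plain Scala and deliberately does not use Spire's API; the `Dual` type and its methods are assumptions made up for this illustration:

```scala
// A dual number carries a value together with its derivative.
final case class Dual(value: Double, deriv: Double) {
  def +(that: Dual): Dual =
    Dual(value + that.value, deriv + that.deriv)

  // Product rule: (fg)' = f'g + fg'
  def *(that: Dual): Dual =
    Dual(value * that.value, deriv * that.value + value * that.deriv)
}

object Dual {
  def constant(c: Double): Dual = Dual(c, 0.0) // d/dx of a constant is 0
  def variable(x: Double): Dual = Dual(x, 1.0) // d/dx of x is 1

  // Evaluate f at x and read off the derivative component.
  def derivative(f: Dual => Dual)(x: Double): Double =
    f(variable(x)).deriv
}
```

For f(x) = x³ + 2x, `Dual.derivative(x => x * x * x + Dual.constant(2.0) * x)(2.0)` evaluates to 14.0, matching f'(x) = 3x² + 2 at x = 2, with no symbolic manipulation or finite differences involved.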

For the most inspiration, though, please have a look at HLearn. This was a series of experiments in Haskell revolving around the ways in which various machine learning tasks could benefit from the algebraic structure provided by various typeclasses. In particular, by taking advantage of such algebraic structure, it becomes possible to do fast incremental training, "untraining," and fast cross-validation. Unfortunately, it seems like the author has largely moved on to other pursuits. But his insights and writing are absolutely applicable to your questions around the use of these structures for data engineering and machine learning in Scala, and I would be more than thrilled to discuss them further.
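
To make the incremental training and "untraining" idea concrete, here is a toy sketch in plain Scala. It is not taken from HLearn: `MeanModel` is a hypothetical model whose `empty`, `combine`, and `inverse` operations mirror what a Monoid and a Group typeclass would provide:

```scala
// A trivially simple "model": enough state to estimate a mean.
final case class MeanModel(count: Long, sum: Double) {
  // Monoid-style combine: merging two trained models equals
  // training once on the concatenated data.
  def combine(that: MeanModel): MeanModel =
    MeanModel(count + that.count, sum + that.sum)

  // Group-style inverse: combining with it "untrains" those points.
  def inverse: MeanModel = MeanModel(-count, -sum)

  def mean: Option[Double] =
    if (count > 0) Some(sum / count) else None
}

object MeanModel {
  val empty: MeanModel = MeanModel(0L, 0.0)

  def train(xs: Seq[Double]): MeanModel =
    xs.foldLeft(empty)((m, x) => m.combine(MeanModel(1L, x)))
}
```

With this structure, a cross-validation fold can be held out by training once on all the data and combining with the inverse of the fold's model, rather than retraining from scratch for every fold.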
