Questions tagged [scala]

Scala is a general-purpose programming language principally targeting the Java Virtual Machine. Designed to express common programming patterns in a concise, elegant, and type-safe way, it fuses both imperative and functional programming styles. Its key features are: an advanced static type system with type inference; function types; pattern-matching; implicit parameters and conversions; operator overloading; full interoperability with Java; concurrency

0
votes
0 answers
7 views

Does maven-shade-plugin work with Scala classes?

I have a Maven project with both Java and Scala components. When I use maven-shade-plugin, it relocates the packages of both the Java and the Scala files, but it ONLY renames the packages inside the Java files, ...
-1
votes
0 answers
18 views

Apache Spark: action after groupBy (with filter) causes TimeoutException/OOM

I use Scala 2.11.11 & Spark 2.3 and want to process a large RDD (~25 GB), but it fails with // [example code] val conf= new SparkConf().setMaster("local[4]").setAppName("BatchExample") val spark = ...
0
votes
1 answer
6 views

FlinkCEP: Can I reference an earlier event to define a subsequent match?

Here is a simple example: val pattern = Pattern.begin[Event]("start").where(_.getId == 42). next("middle").subtype(classOf[SubEvent]).where(x => x.getVolume == **first event matched**....
-2
votes
0 answers
9 views

Writing an RDD containing JSON to a Hive table in a remote cluster

I have an RDD which I am trying to write into a Hive table located in a remote cluster. myrdd.foreach(println) gives 2cbeb5cb-219c-4a84-b0b1-fa13de0cbbd4,abc,2019-03-22 17:24:17.484,xyz,some,N,{"a":"...
0
votes
1 answer
18 views

Spark partitioning breaks the lazy evaluation chain and triggers an error which I cannot catch

When doing the re-partitioning, Spark breaks the lazy evaluation chain and triggers an error which I cannot control/catch. //simulation of reading a stream from s3 def readFromS3(partition: Int)...
-1
votes
0 answers
15 views

NullPointerException when performing an action on an RDD [duplicate]

I am trying to populate a table structure using DataFrames. Here, a column of the table is initially represented by a DataFrame, which would in turn be filled by a Scala stream backed by a regular ...
0
votes
0 answers
8 views

What is the correct order to close the SMBJ share, session, and client?

This is my first time using the SMBJ library to connect to a NAS drive. I am getting the following stack trace while trying to close the share, connection, session, and client in the following order. ...
0
votes
0 answers
26 views

Scala - predicate inside filter [duplicate]

Suppose this very simple example: def complexMethod (first : Int, second : Int) : Boolean = { //heavy stuff goes in here true } val myList : List[Int] = List(1,2,3) val newList : List[Int] = ...
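
A hedged sketch of the usual resolution: a multi-parameter method can serve as a filter predicate by partially applying it. The method body and the fixed second argument here are placeholders taken from the excerpt.

    def complexMethod(first: Int, second: Int): Boolean = {
      // heavy stuff goes in here
      true
    }
    val myList: List[Int] = List(1, 2, 3)
    // partially apply: fix `second`, let filter supply each element as `first`
    val newList: List[Int] = myList.filter(complexMethod(_, 2))
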
1
vote
0 answers
40 views

Discarded non-Unit value for a Unit declaration

Can someone please explain what this means: Error:(32, 28) discarded non-Unit value dataFrameReader.load() wasCalled once. I've looked at some online articles and I don't quite understand it. ...
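
This diagnostic typically comes from compiling with -Ywarn-value-discard (escalated to an error by -Xfatal-warnings): an expression's non-Unit result is being thrown away in a Unit position. A minimal sketch of the pattern, not the asker's actual test code:

    def answer(): Int = 42
    def run(): Unit = {
      answer()          // discarded non-Unit value: the Int result is thrown away
      val _ = answer()  // binding the result to _ discards it explicitly, silencing the warning
    }
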
-3
votes
0 answers
16 views

java.lang.StackOverflowError while building a sentiment analysis model on Spark 2

We are trying to perform sentiment analysis on Twitter data. When we run our code we get a java.lang.StackOverflowError. Can anyone figure out the cause of this error? https://i....
0
votes
0 answers
33 views

Random order of transformation execution issue

We have an issue with the order of execution of Spark transformations, which seems to be arbitrary. We have 2 RDDs with related events that we classify. Multiple classifiers are applied to rdd1, but for ...
1
vote
1 answer
28 views

Efficient computation of haversine distance between elements of collections

I have two collections. Each collection is made up of inner collections, each containing a latitude, a longitude, and an epoch. val arr1= Seq(Seq(34.464, -115.341,1486220267.0), Seq(34.473, -115.452,1486227821.0),...
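
For reference, a standalone haversine implementation in plain Scala; a minimal sketch assuming coordinates in degrees and a result in metres.

    def haversine(lat1: Double, lon1: Double, lat2: Double, lon2: Double): Double = {
      val earthRadiusMetres = 6371000.0
      val dLat = math.toRadians(lat2 - lat1)
      val dLon = math.toRadians(lon2 - lon1)
      val a = math.pow(math.sin(dLat / 2), 2) +
        math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
      2 * earthRadiusMetres * math.asin(math.sqrt(a))
    }
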
0
votes
0 answers
17 views

How do I make sbt run a task only once even though it is indirectly specified multiple times on the CLI?

Here is my plugin code. It defines a master lint task that is triggered from the CLI like so: sbt api/lint jobs/lint. It calls out to some project-specific linters and some build-wide linters. The ...
-4
votes
0 answers
24 views

Spark Scala: sparkContext.addFile cannot retrieve file with SparkFiles.get

I'm trying to write a simple program to submit to a Spark cluster. I need to download a JSON file stored on an HTTP server and process it. The cluster is composed of the master node and 1 worker node, ...
0
votes
0 answers
21 views

dropDuplicates is not giving the expected result

I am working on a use case of removing duplicate records from incoming structured data (in the form of CSV files within a folder on HDFS). In order to try this use case, I wrote some sample code using ...
0
votes
0 answers
14 views

Spark DataFrame still shows CRLF in Windows Notepad++

I am creating a temp view in Spark using the df.createOrReplaceTempView function. After creating the view, I am applying SQL on the last column to remove the carriage return. Given below is sample code....
-2
votes
0 answers
54 views

What is the difference between a library, a framework, and user-defined functions in programming? [on hold]

I am new to programming and come from a database background, so I am confused by the terms library, framework, and user-defined function. Can anyone please explain them?
0
votes
1 answer
16 views

Checking whether elements of a tweets array contain one of the elements of a positive-words array, and counting them

We are building a sentiment analysis application and we converted our tweets DataFrame to an array. We created another array consisting of positive words. But we cannot count the number of tweets ...
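
A hedged sketch of the counting step, with made-up sample data standing in for the tweet and positive-word arrays:

    val tweets = Array("spark is great", "rain again", "great mood today")
    val positiveWords = Array("great", "good", "happy")
    // count the tweets that contain at least one positive word
    val positiveCount = tweets.count(tweet => positiveWords.exists(w => tweet.contains(w)))
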
0
votes
0 answers
15 views

Reading an Avro file with a schema into a Spark Dataset using a case class

I'm struggling to find a suitable example of reading an Avro file into a Spark Dataset using a case class. I would like to use a Dataset rather than a DataFrame, and I would like to maintain my Avro schema in ...
1
vote
3 answers
41 views

How to retrieve derived classes as-is from a Map?

I have to retrieve derived-class objects stored in a Map, given the respective class name as the key, as shown below: trait Caluclator class PreScoreCalculator extends Caluclator class BenchMarkCalculator ...
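
One common answer is a type pattern on the retrieved value; a minimal sketch using the trait hierarchy from the excerpt (spelling normalized, and the string keys are an assumption):

    trait Calculator
    class PreScoreCalculator extends Calculator
    class BenchMarkCalculator extends Calculator

    val calculators: Map[String, Calculator] = Map(
      "PreScoreCalculator"  -> new PreScoreCalculator,
      "BenchMarkCalculator" -> new BenchMarkCalculator
    )
    // recover the concrete subtype with a pattern match
    calculators.get("PreScoreCalculator") match {
      case Some(c: PreScoreCalculator) => println(s"got the derived instance: $c")
      case _                           => println("key missing or different subtype")
    }
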
0
votes
0 answers
13 views

Databricks fails accessing a Data Lake Gen1 while trying to enumerate a directory

I am using (well... trying to use) Azure Databricks and I have created a notebook. I would like the notebook to connect to my Azure Data Lake (Gen1) and transform the data. I followed the documentation ...
0
votes
3 answers
50 views

In Scala, List :: doesn't do implicit conversion

In Scala, it is OK to convert a variable in a Seq, but if I construct the Seq with :: it doesn't work. For example: case class A(s: String) implicit def toA(s: String): A = A(s) val Seq(a, b, ...
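
The usual explanation: :: is declared as def ::[B >: A](elem: B): List[B], so the compiler simply widens B to a common supertype instead of searching for a conversion; ascribing the expected element type forces the implicit. A minimal sketch built from the excerpt's definitions:

    case class A(s: String)
    implicit def toA(s: String): A = A(s)

    val as: List[A] = List(A("x"))
    val widened   = "y" :: as        // compiles by widening B to a common supertype; toA is never used
    val converted = ("y": A) :: as   // the ascription triggers toA, giving List[A]
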
0
votes
0 answers
20 views

Bitwise ORing selected columns in a Spark DataFrame

I have a Spark DataFrame dist with the following schema:

+-----+-----+-----+-----+-----+
| id1 | id2 | _c1 | _c2 | _c3 |
+-----+-----+-----+-----+-----+
| int | int | bin | bin | bin |
+-----+-----+-----+...
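
If the bin columns are stored as integral types, Column.bitwiseOR can be chained; a hedged sketch assuming a DataFrame named dist as in the excerpt:

    import org.apache.spark.sql.functions.col
    // OR the three flag columns pairwise into a single column
    val ored = dist.withColumn("ored",
      col("_c1").bitwiseOR(col("_c2")).bitwiseOR(col("_c3")))
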
-3
votes
0 answers
32 views

How to find the correlation between two columns of a DataFrame in Spark Scala?

I am working with a big DataFrame and am trying to get the correlation between two columns. I used this code in Scala: val corr = df.stat.corr("BEN","O_3")
1
vote
0 answers
15 views

Scala, Couchbase - convert AsyncN1qlQueryResult into a custom object

I have a case class with simple data: case class MyClass( details: Details, names: List[String], id: String ) I have created a Couchbase query which should retrieve all documents from ...
2
votes
1 answer
16 views

Getting an error: JsValue expected but JsNode supplied

I am using Play 2.5 and I need advice on a compile error. It says that I have given a JsNode value where a JsValue was expected. I don't know what is wrong. val response : Future[Result] = ...
1
vote
1 answer
24 views

How can I use the last result from a Scala map as input to the next function?

I'm working through some Project Euler questions to practice my Scala. For problem 7 I have to find the 10001st prime. I have a working solution, but I don't feel it's as functional as it could be. def ...
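
A common, more functional shape for this problem: a lazy stream of candidates filtered by a primality test and indexed directly. A minimal sketch (Scala 2.12; in 2.13, LazyList replaces Stream):

    def isPrime(n: Int): Boolean =
      (2 to math.sqrt(n.toDouble).toInt).forall(n % _ != 0)
    // Stream is lazy, so only as many primes as needed are computed;
    // index 10000 is the 10001st prime
    val prime10001 = Stream.from(2).filter(isPrime)(10000)
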
1
vote
0 answers
26 views

How to convert an array of vectors to a DataFrame in Spark?

I have clustercenters = model.clusterCenters from a k-means model in org.apache.spark.ml.clustering.KMeans. The result is an Array[org.apache.spark.ml.linalg.Vector]. I want to convert this into a DataFrame ...
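
One route is to go through Array[Double], for which Spark has a built-in encoder; a hedged sketch assuming a SparkSession named spark:

    import spark.implicits._
    // ml Vector -> Array[Double], then let the implicit encoder build the DataFrame
    val centersDF = clustercenters.map(_.toArray).toSeq.toDF("center")
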
1
vote
1 answer
28 views

Add two flags in a Maven tag to skip tests

I'm using the ScalaTest plugin to execute some IT tests. I need to skip all the tests (unit, IT) in some pipelines, and they only use the flag ${skipTests}, so I need to use that flag for all my types of ...
1
vote
1 answer
34 views

OpenPDF/iText corrupt documents

I've been trying to re-implement the concatenate example from OpenPDF 1.2.4 and 1.2.11 in Scala: def mergePdfs(docs: Seq[Array[Byte]]): Array[Byte] = { log.debug(s"merging ${docs.size} PDFs") ...
0
votes
1 answer
27 views

Java class type mismatch in Scala?

I am trying to import the Color class from JavaFX and use it in a Scala class. But when I use my class somewhere else, I get this error: Error:(9, 50) type mismatch; found : javafx.scene.paint....
1
vote
1 answer
18 views

Unable to create a Configuration object

I want to create the following configuration in code: mail{ signupUrl ="/test/ws/users/signup/" signupFrom="Test<mailrobot@test.com>" signupReply="Test<noreply@test.comm>" } I ...
0
votes
0 answers
20 views

Flink Session Window (based on EventTime) with Expiry Time? [on hold]

I have a stream of events with an "id" and a "timestamp" field. From this stream, how can I filter out the "id"s, for which I'm receiving events continuously, for more than 3 minutes, using Flink?
0
votes
1 answer
38 views

How to print the output of Spark SQL to a DataFrame

I'm currently running the ANALYZE command for a particular table and can see the statistics being printed in the Spark console. However, when I try to write the output to a DF, I cannot see the ...
-3
votes
1 answer
29 views

How to convert a Scala List (immutable.List) to a Java List [duplicate]

A schema I have to use is defined in a Scala package and uses scala.immutable.List[Jsobject]. I want to convert this container to java.util.List. What should I do? This is my first question! :)
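
A minimal sketch with the standard converters (scala.collection.JavaConverters; deprecated in 2.13 in favour of scala.jdk.CollectionConverters):

    import scala.collection.JavaConverters._
    val scalaList: List[String] = List("a", "b")
    val javaList: java.util.List[String] = scalaList.asJava
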
0
votes
0 answers
9 views

Spark DataFrame write to Elasticsearch using the es-hadoop connector with AWS role-based signing

I am looking for a connector/client in Scala which sends signed requests to AWS Elasticsearch. I've already used "org.elasticsearch" % "elasticsearch-hadoop" % "6.3.0". This sends requests to AWS ...
-3
votes
0 answers
19 views

Spark-Scala application using the Template design pattern

I am developing a Spark-Scala application in which I am planning to use the Template design pattern. Here is the proposed design: ProjectTemplate.scala => This is a trait containing functions such as ...
0
votes
1 answer
25 views

Append an XML node if it is not present in the list

I have a block of code: object XmlExample { def main(args: Array[String]): Unit = { val someXml = <books> <book title="The Woman in White"> <author>Wilkie ...
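
A hedged sketch of the check-then-append idea with scala.xml, using a made-up title for the candidate node:

    val someXml =
      <books>
        <book title="The Woman in White">
          <author>Wilkie Collins</author>
        </book>
      </books>
    // collect existing titles, then copy the element with the new child appended only if absent
    val titles = (someXml \ "book").map(_ \@ "title")
    val updated =
      if (titles.contains("Dracula")) someXml
      else someXml.copy(child = someXml.child :+
        <book title="Dracula"><author>Bram Stoker</author></book>)
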
0
votes
0 answers
33 views

How to ignore some classes during Scala compilation

I am running sbt in my console, where I can enter the commands compile, run, test, ... By default these commands always compile every single class they see in your project. The thing is, sometimes you ...
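
sbt can exclude sources from compilation by file-name pattern; a hedged build.sbt sketch, assuming the classes to skip can be matched by name (the pattern here is invented):

    // in build.sbt: skip any source file matching the pattern
    unmanagedSources / excludeFilter := HiddenFileFilter || "*Experimental*.scala"
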
0
votes
1 answer
14 views

Access Play Framework Router routing table at run time

I am writing an EssentialFilter so I can perform an operation on every request. However all that the filter receives is a RequestHeader object and I need to know information about the actual ...
-2
votes
0 answers
26 views

How to manage writing into a CSV file with a header dynamically

val rdd = df.rdd.map(line => Row.fromSeq( "BNK" :: format.format(Calendar.getInstance().getTime()) :: line(0) :: scala.xml.XML.loadString("<?xml version='1.0' ...
0
votes
1 answer
24 views

MapReduce Spark data processing with RDDs (Scala)

I have big data and I want to use MapReduce on it, but I can't find anything for this task. (Language: Scala) The data for this process is: Y,20,01 G,18,40 J,19,10 D,50,10 R,20,01 Z,18,40 T,...
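
A hedged sketch of the classic map/reduce shape over rows like these, assuming a SparkContext named sc and that the goal is to aggregate the numeric second field per key:

    val data = sc.parallelize(Seq("Y,20,01", "G,18,40", "J,19,10", "D,50,10", "R,20,01", "Z,18,40"))
    val totals = data
      .map(_.split(","))
      .map(parts => (parts(0), parts(1).toInt)) // map: key by the letter, value = 2nd field
      .reduceByKey(_ + _)                       // reduce: sum the values per key
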
1
vote
1 answer
19 views

Is it possible to get a warning when there is a new version of an internal library?

Is there an sbt plugin or piece of code which I could use to give me a warning when I compile using outdated internal libraries? Let's say another team has updated their library but I am still using ...
0
votes
1 answer
11 views

The generated route files of Play Framework are re-generated automatically, even when there are no changes in conf/routes

We're working with Play Framework 2.6.21 (but we're able to reproduce it with Play 2.6.19). Sometimes (the problem appears quite randomly), our Play application starts to recompile and reload on ...
0
votes
0 answers
31 views

How does the RDD take() method work internally?

I understand that take(n) will return n elements of an RDD, but how does Spark decide which partitions to take those elements from, and which elements should be chosen? Does it maintain indexes ...
0
votes
1 answer
15 views

Checkbox lists in Play with Scala

I have ... a Seq[RoleId] defining all roles that a user can get granted, a User with a property roles: Seq[RoleId] (where the roles are those that the user has been granted), and a Play controller preparing a ...
0
votes
0 answers
28 views

How to make EitherT[Future, String, Int] from Future[Either[String, Int]] with cats?

I have this code: type Response[A] = EitherT[Future, String, A] val powerLevels = Map( "Jazz" -> 6, "Bumblebee" -> 8, "Hot Rod" -> 10 ) def getPowerLevel(autobot: String)...
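
EitherT's constructor wraps an F[Either[A, B]] directly, which is exactly this conversion; a minimal sketch with cats, reusing the Response alias from the excerpt:

    import cats.data.EitherT
    import scala.concurrent.Future

    type Response[A] = EitherT[Future, String, A]
    val raw: Future[Either[String, Int]] = Future.successful(Right(8))
    val wrapped: Response[Int] = EitherT(raw)
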
0
votes
1 answer
14 views

Mongo Scala Play - java.lang.NoSuchMethodError: com.mongodb.ConnectionString.getApplicationName()Ljava/lang/String;

I'm trying to do basic CRUD on a Scala Play Mongo prototype. The code works as a standalone main method, but when executed as a Play application invoked through a controller, I get a runtime ...
1
vote
0 answers
18 views

Optimize Hive table loading time in Spark

I am working on a Spark Streaming project in which the incoming stream joins with multiple Hive tables, so I am loading the tables in Spark and caching them. Below is a sample query: val df = sql("...
0
votes
1 answer
35 views

java.sql.Timestamp to microseconds (Long with precision down to microseconds)

I have already looked at a question from here and its solution, but the solution looks like it's from 2013 and I am looking for a better one. How to convert a DateTime (Timestamp) from Oracle to ...
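
One self-contained option via java.time; a minimal sketch that keeps microsecond precision from the Timestamp's nano field:

    import java.sql.Timestamp

    def toMicros(ts: Timestamp): Long = {
      val instant = ts.toInstant
      // whole seconds in micros, plus the sub-second nanos truncated to micros
      instant.getEpochSecond * 1000000L + instant.getNano / 1000L
    }
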
